Embodied Pretraining via Cell-Operated Simulation: a 70k-environment study
Lin Hao, Dimitri Yelchaninov, Sera Wijewardene, Ananya Mukherjee
@techreport{hao2025embodied,
title = {Embodied Pretraining via Cell-Operated Simulation: a 70k-environment study},
author = {Hao, Lin and Yelchaninov, Dimitri and Wijewardene, Sera and Mukherjee, Ananya},
year = {2025},
number = {Internal release — alphabell index 25/07},
institution = {alphabell},
month = {jul},
doi = {10.48550/arXiv.2507.46060},
url = {https://dev.alphabell.com/publications/embodied-pretraining-via-simulation}
}
Abstract
Three cells in the world-models axis pooled their simulation infrastructure into a 70,142-environment training pool drawn from physics, social, and symbolic dynamics. We show that pretraining a 7B-parameter policy on the pool yields a 38% sample-efficiency improvement on every downstream robotics benchmark we evaluated, and — more importantly — produces dynamics-aware representations that transfer non-trivially to symbolic planning tasks. We argue the result supports treating perception and prediction as a single learning objective rather than two.
Index metadata
- Cell: fermat-31, voronoi-19, hadamard-08
- Compute: 640 H100-days
- Status: Open release
- Data: 70k-env pool index published; full pool available via federated scheduler
- DOI: 10.48550/arXiv.2507.46060
- arXiv: arXiv:2507.46060
What this paper is part of
This index entry is part of the World models research axis. The producing cells (fermat-31, voronoi-19, and hadamard-08, as listed in the index metadata above) collaborate with adjacent cells listed in the cell directory. The paired interpretability cell, where applicable, is identified in the metadata above; its disagreement reports, if any, accompany the public release.
How to read this
If you want to use the result: the code (where available) is at https://github.com/alphabell-labs/ab-embodied; a dataset location has not yet been announced (TBD). To cite this report, prefer the DOI/arXiv identifier and the BibTeX block above. To discuss it with the producing cell, contact the lab and quote the index entry slug embodied-pretraining-via-simulation.
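As a minimal sketch of citing the report from a LaTeX document: the citation key hao2025embodied comes from the BibTeX block above, while the file name references.bib and the plain bibliography style are assumptions made for illustration only.

```latex
\documentclass{article}
\begin{document}

% The key matches the @techreport entry in the BibTeX block above.
Pretraining on a pooled simulation suite is reported
in~\cite{hao2025embodied}.

% Assumption: the BibTeX entry has been saved to references.bib
% in the same directory as this file.
\bibliographystyle{plain}
\bibliography{references}

\end{document}
```

Standard styles such as plain typically do not print the doi and url fields, so a style that supports them may be preferable if the identifier should appear in the reference list.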
Limitations
Each cell-published report carries an explicit limitations section in the internal index. We do not paraphrase it here. Read the linked PDF — particularly its limitations and threats-to-validity sections — before downstream use.
Lin Hao, Dimitri Yelchaninov, Sera Wijewardene, Ananya Mukherjee. Embodied Pretraining via Cell-Operated Simulation: a 70k-environment study. Internal release — alphabell index 25/07, Jul 2025. arXiv:2507.46060. doi:10.48550/arXiv.2507.46060.