
World models.

Predictive generative systems that learn the dynamics of physical, social, and symbolic environments from heterogeneous data — built around the bet that perception and prediction are best treated as a single training objective.

Active cells 5

2025 published reports 5

Axis steward Sasha Petrov (term ends 2026-Q1)

Compute commitment 33% of pool

Position

The world-models axis at alphabell is built on a simple intuition that has not yet been fully cashed out anywhere: most of what we call perception is prediction in disguise, and most of what we call prediction is a derivative of having perceived a coherent environment. The training objectives we use should reflect that. Cells in this axis treat perception and prediction as a single learning objective and ask what falls out.

What falls out, in our 2024–2025 work, is threefold: representations that transfer non-trivially between modalities, dynamics models that compose, and counterfactual rollouts that are uniformly cheaper to verify than to generate from scratch.

Research threads

Compositional latent dynamics. Latent dynamics composed from a discrete library of learned operators rather than predicted by a monolithic transition model. The CLD result (25/09) is the canonical reference; we are exploring extension to multi-agent settings.
Counterfactual rollout for planning. Confidence-weighted what-if simulations from a world model, used as the planning kernel for cell-hosted production planners. The 30-day deployment study (24/22) is the canonical reference.
Embodied simulation pools. Three cells in this axis maintain a pooled simulation infrastructure totalling 70k+ environments across physics, social, and symbolic dynamics. The pool is shared as a federated resource; cells outside the axis can train on subsets via the scheduler.
Unification of perception and prediction. Single-objective training of representations that handle perception and prediction together — the load-bearing bet behind the 70k-env study (25/07) result.
Robotics pretraining. Sim-to-real transfer from world-model-trained policies to deployed robotics platforms. fermat-31 is the active cell here, currently working with a cable-manufacturing partner.
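As an illustration of the compositional latent dynamics thread above, here is a minimal sketch of a transition function assembled from a discrete operator library instead of a single monolithic model. The operator names, the hand-written lambdas, and the `compose` and `rollout` helpers are all hypothetical stand-ins; in the CLD setting the operators would be learned, and nothing here is taken from the 25/09 report.

```python
from typing import Callable, Dict, List, Sequence

Latent = List[float]
Operator = Callable[[Latent], Latent]

# Hypothetical operator library. Each entry stands in for a learned operator;
# the hand-written functions are placeholders for illustration only.
OPERATORS: Dict[str, Operator] = {
    "translate": lambda z: [x + 1.0 for x in z],
    "damp": lambda z: [0.5 * x for x in z],
    "reflect": lambda z: [-x for x in z],
}


def compose(names: Sequence[str],
            library: Dict[str, Operator] = OPERATORS) -> Operator:
    """Build one transition function by applying the named operators in order."""
    ops = [library[name] for name in names]  # KeyError on an unknown operator

    def transition(z: Latent) -> Latent:
        for op in ops:
            z = op(z)
        return z

    return transition


def rollout(z0: Latent, program: Sequence[str], steps: int) -> List[Latent]:
    """Apply the composed transition repeatedly and return the full trajectory."""
    step = compose(program)
    trajectory = [list(z0)]
    for _ in range(steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory


if __name__ == "__main__":
    # Each step maps z to 0.5 * (z + 1).
    print(rollout([1.0, -2.0], ["translate", "damp"], steps=2))
```

The point of the design is that a new dynamics model is a new sequence of operator names, so recombining existing operators needs no retraining in this toy setting.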

Methodological position

We do not believe in scaling-only as a research strategy in world models. The 70k-env study showed that data diversity dominates within the scaling regimes we can afford, and that even a marginally better compositional structure pays off more than a 4× increase in data on the benchmarks we run. We have not been able to find a regime where this stops being true.

We are sceptical of evaluation benchmarks that mistake interpolation for prediction. Many of the published world-model benchmarks measure the former. Our cells maintain their own internal benchmark suites with explicit out-of-distribution holdouts; we are open-sourcing the suite incrementally as cells confirm its production-readiness.
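One way to make an out-of-distribution holdout explicit is to split by environment family, so that no held-out family contributes any episode to training. The sketch below assumes a hypothetical episode record with a `family` field; the function names and the record layout are illustrative and do not describe the internal benchmark suite.

```python
from typing import Dict, List, Set, Tuple

Episode = Dict[str, str]


def split_by_family(episodes: List[Episode],
                    holdout_families: Set[str]) -> Tuple[List[Episode], List[Episode]]:
    """Hold out whole environment families so the OOD set shares none with training."""
    train = [e for e in episodes if e["family"] not in holdout_families]
    ood = [e for e in episodes if e["family"] in holdout_families]
    if not ood:
        raise ValueError("no episodes matched the holdout families")
    return train, ood


def generalisation_gap(in_dist_score: float, ood_score: float) -> float:
    """In-distribution score minus OOD score; a large gap suggests interpolation."""
    return in_dist_score - ood_score


if __name__ == "__main__":
    episodes = [
        {"id": "a", "family": "physics"},
        {"id": "b", "family": "social"},
        {"id": "c", "family": "symbolic"},
        {"id": "d", "family": "physics"},
    ]
    train, ood = split_by_family(episodes, {"symbolic"})
    print([e["id"] for e in train], [e["id"] for e in ood])
```

A random per-episode split would instead place episodes from every family on both sides, which is the interpolation setting the paragraph above warns about.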

Where this connects

To agentic engineering. Plans built on world models are the difference between agents that can survive interruptions and agents that cannot. Most of the long-horizon planning work in agentic cells consumes world models from this axis.

To interpretability. Predictive models are easier to interpret when their state is compositional and their dynamics decompose. The interpretability axis has invested in tooling specifically for world-model-style architectures.

To recursive self-improvement. RSI cells use world models to simulate the consequences of proposed modifications before any candidate change is committed to the actual training procedure. The MUR protocol depends on a credible counterfactual estimator.
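To make the idea of gating a change on a counterfactual estimate concrete, here is a minimal sketch: each simulated rollout yields a predicted outcome delta and a confidence, and a candidate is accepted only if the confidence-weighted mean delta clears a threshold. The function names, the threshold, and the weighting rule are assumptions made for illustration; this is not the MUR protocol, whose details are not given here.

```python
from typing import List, Tuple

# Each sample is (predicted_delta, confidence), with confidence in [0, 1].
Sample = Tuple[float, float]


def weighted_effect(samples: List[Sample]) -> float:
    """Confidence-weighted mean of the predicted deltas across rollouts."""
    total_confidence = sum(conf for _, conf in samples)
    if total_confidence == 0:
        return 0.0  # no credible evidence either way
    return sum(delta * conf for delta, conf in samples) / total_confidence


def should_commit(samples: List[Sample], threshold: float = 0.0) -> bool:
    """Accept a candidate change only if its estimated effect exceeds the threshold."""
    return weighted_effect(samples) > threshold


if __name__ == "__main__":
    # One confident positive rollout and one low-confidence negative rollout.
    rollouts = [(1.0, 0.75), (-1.0, 0.25)]
    print(weighted_effect(rollouts), should_commit(rollouts, threshold=0.25))
```

Weighting by confidence means a single low-confidence rollout cannot veto or approve a change on its own, which is one simple reading of what a credible estimator has to provide.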

Active cells under this axis

voronoi-19 · Compositional dynamics · Counterfactual rollout · World models · 5 contributors · active
bessel-04 · Embodied simulation pool · 70k-env study follow-on · World models · 7 contributors · active
fermat-31 · Robotics pretraining · Sim2real on cable mfg · World models · 6 contributors · active
hadamard-08 · Cross-modal latent unification · Vision+symbol coherence · World models · 4 contributors · active
riemann-44 · Time-correlated reward learning · Long-horizon credit assignment · World models · 5 contributors · active


Publications under this axis

2512.04918
conf

Symbolic World Models for Procedural Reasoning

Dimitri Yelchaninov, Lin Hao, Ananya Mukherjee, Sera Wijewardene

NeurIPS 2025 · alphabell index 25/23

Dec 2025

world models

2508.58450
preprint

Compositional Latent Dynamics for Long-Horizon World Modelling

Jonas Bremer, Sasha Petrov, Felicity Anjali Sandirasegaram, Tomoko Niwa

Internal release — alphabell index 25/09 · arXiv 2508.10912

Aug 2025

world models

2507.05432
conf

Compositional Generalisation in Mixed-Modality World Models

Wen Shao, Søren Almqvist, Tomoko Niwa, Jonas Bremer

ICML 2025 · alphabell index 25/12b

Jul 2025

world models

2507.46060
internal

Embodied Pretraining via Cell-Operated Simulation: a 70k-environment study

Lin Hao, Dimitri Yelchaninov, Sera Wijewardene, Ananya Mukherjee

Internal release — alphabell index 25/07

Jul 2025

world models

2504.04019
conf

Predictive Coding Objectives for Operator Discovery

Jonas Bremer, Tomoko Niwa, Sasha Petrov

ICLR 2025 · alphabell index 25/02b

Apr 2025

world models

2412.93805
internal

Counterfactual Rollouts for Planning: a 30-day deployment study

Sasha Petrov, Maya Quesada, Bilal Hossain

Internal release — alphabell index 24/22

Dec 2024

world models

2412.03998
conf

Adversarial Robustness of Goal-Conditioned World Models

Sasha Petrov, Jonas Bremer, Tomoko Niwa, Maya Quesada

NeurIPS 2024 · alphabell index 24/20

Dec 2024

world models
