
World models.

Predictive generative systems that learn the dynamics of physical, social, and symbolic environments from heterogeneous data — built around the bet that perception and prediction are best treated as a single training objective.

Active cells: 5
2025 published reports: 5
Axis steward: Sasha Petrov (term ends 2026-Q1)
Compute commitment: 33% of pool

Position

The world-models axis at alphabell is built on a simple intuition that has not yet been fully cashed out anywhere: most of what we call perception is prediction in disguise, and most of what we call prediction is a derivative of having perceived a coherent environment. The training objectives we use should reflect that. Cells in this axis treat perception and prediction as a single learning objective and ask what falls out.

What falls out, in our 2024-2025 work: representations that transfer non-trivially between modalities, dynamics models that compose, and counterfactual rollouts that are uniformly cheaper to verify than to generate from scratch.

Research threads

  • Compositional latent dynamics. Latent dynamics composed from a discrete library of learned operators rather than predicted by a monolithic transition model. The CLD result (25/09) is the canonical reference; we are exploring extension to multi-agent settings.
  • Counterfactual rollout for planning. Confidence-weighted what-if simulations from a world model, used as the planning kernel for cell-hosted production planners. The 30-day deployment study (24/22) is the canonical reference.
  • Embodied simulation pools. Three cells in this axis maintain a pooled simulation infrastructure totalling 70k+ environments across physics, social, and symbolic dynamics. The pool is shared as a federated resource; cells outside the axis can train on subsets via the scheduler.
  • Unification of perception and prediction. Single-objective training of representations that handle perception and prediction together. This is the load-bearing bet behind the result of the 70k-env study (25/07).
  • Robotics pretraining. Sim-to-real transfer from world-model-trained policies to deployed robotics platforms. fermat-31 is the active cell here, currently working with a cable-manufacturing partner.
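
The first two threads above can be illustrated with a toy sketch: a transition built by composing operators from a discrete library, and a what-if score that discounts each simulated step by how much the model trusts it. Everything named here (`LIBRARY`, `push`, `boost`, the confidence values, the reward) is a hypothetical stand-in chosen for illustration; it is not the CLD method or the planner from the deployment study, whose details this page does not give.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Latent = List[float]
Operator = Callable[[Latent], Latent]

# Hypothetical operator library: name -> (transition, confidence in [0, 1]).
# In a real system both would be learned; these are hand-written stand-ins.
LIBRARY: Dict[str, Tuple[Operator, float]] = {
    "push": (lambda z: [x + 1.0 for x in z], 1.0),
    "boost": (lambda z: [2.0 * x for x in z], 0.5),
}


def rollout(z0: Latent, program: Sequence[str]) -> List[Latent]:
    """Apply the named operators in order and return every visited state."""
    states = [list(z0)]
    for name in program:
        op, _ = LIBRARY[name]
        states.append(op(states[-1]))
    return states


def weighted_return(
    z0: Latent, program: Sequence[str], reward: Callable[[Latent], float]
) -> float:
    """Sum step rewards, each scaled by the running product of confidences."""
    states = rollout(z0, program)
    trust, total = 1.0, 0.0
    for name, state in zip(program, states[1:]):
        trust *= LIBRARY[name][1]  # trust decays as less certain operators are chained
        total += trust * reward(state)
    return total


def best_plan(
    z0: Latent,
    candidates: Sequence[Sequence[str]],
    reward: Callable[[Latent], float],
) -> Sequence[str]:
    """Pick the candidate program with the highest confidence-weighted return."""
    return max(candidates, key=lambda p: weighted_return(z0, p, reward))
```

Starting from `[1.0]` with the first latent coordinate as reward, the program `push, push` scores 5.0, `push, boost` scores 4.0, and `boost, boost` scores 2.0, so the planner prefers the plan whose steps the model trusts most even though `boost` reaches a larger raw state.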

Methodological position

We do not believe that scaling alone is a viable research strategy for world models. The 70k-env study showed that data diversity dominates within the scaling regimes we can afford, and that even a marginally better compositional structure pays off more than a 4× increase in data on the benchmarks we run. We have not found a regime where this stops being true.

We are sceptical of evaluation benchmarks that confuse interpolation with prediction. Many of the published world-model benchmarks measure the former. Our cells maintain their own internal benchmark suites with explicit out-of-distribution holdouts; we are open-sourcing the suite incrementally as cells confirm its production-readiness.
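
The difference between an interpolation benchmark and an out-of-distribution holdout can be made concrete with a small sketch. It assumes each environment is described by a single numeric parameter (the `gravity` key used in the usage note is hypothetical); the internal suites are not described on this page and may be constructed differently.

```python
import random
from typing import Dict, List, Tuple

Env = Dict[str, float]


def random_split(
    envs: List[Env], test_frac: float, seed: int
) -> Tuple[List[Env], List[Env]]:
    """Interpolation-style split: test envs are drawn from the same range as train."""
    shuffled = list(envs)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def ood_split(
    envs: List[Env], key: str, low: float, high: float
) -> Tuple[List[Env], List[Env]]:
    """OOD-style split: train only inside [low, high], test only outside it."""
    train = [e for e in envs if low <= e[key] <= high]
    test = [e for e in envs if not (low <= e[key] <= high)]
    return train, test


def is_out_of_range(train: List[Env], test: List[Env], key: str) -> bool:
    """True if every test env lies strictly outside the parameter range seen in training."""
    lo = min(e[key] for e in train)
    hi = max(e[key] for e in train)
    return all(e[key] < lo or e[key] > hi for e in test)
```

With ten environments whose `gravity` runs from 1.0 to 10.0, `ood_split(envs, "gravity", 3.0, 8.0)` trains on six environments and tests on the four at the extremes, and `is_out_of_range` confirms that no test value falls inside the training range. A random split offers no such guarantee, which is why a model that scores well on it may only be interpolating.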

Where this connects

To agentic engineering. Plans built on world models are the difference between agents that can survive interruptions and agents that cannot. Most of the long-horizon planning work in agentic cells consumes world models from this axis.

To interpretability. Predictive models are easier to interpret when their state is compositional and their dynamics decompose. The interpretability axis has invested in tooling specifically for world-model-style architectures.

To recursive self-improvement. RSI cells use world models to simulate the consequences of proposed modifications before any candidate change is committed to the actual training procedure. The MUR protocol depends on a credible counterfactual estimator.
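
The idea of simulating a change before committing it can be sketched as a simple gate. This is a generic illustration and not the MUR protocol, which this page does not specify. The `simulate_delta` callable is a hypothetical stand-in for a counterfactual estimator that returns the simulated change in some metric for one rollout, and the normal-approximation bound is an assumption of the sketch.

```python
import math
import statistics
from typing import Callable, List


def lower_confidence_bound(deltas: List[float], z: float = 1.96) -> float:
    """Mean minus z standard errors; needs at least two samples."""
    mean = statistics.mean(deltas)
    sem = statistics.stdev(deltas) / math.sqrt(len(deltas))
    return mean - z * sem


def should_commit(
    simulate_delta: Callable[[int], float],  # hypothetical estimator, seeded per rollout
    n_rollouts: int = 16,
    z: float = 1.96,
) -> bool:
    """Commit only if the simulated improvement is positive with some margin."""
    deltas = [simulate_delta(seed) for seed in range(n_rollouts)]
    return lower_confidence_bound(deltas, z) > 0.0
```

A change whose simulated effect is consistently positive passes the gate, while one whose rollouts average out to zero does not. How credible the gate is depends entirely on how well the estimator's rollouts track the real training procedure.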

Active cells under this axis

  • voronoi-19. Compositional dynamics; counterfactual rollout. Axis: World models. 5 contributors. Active.
  • bessel-04. Embodied simulation pool; 70k-env study follow-on. Axis: World models. 7 contributors. Active.
  • fermat-31. Robotics pretraining; sim2real on cable mfg. Axis: World models. 6 contributors. Active.
  • hadamard-08. Cross-modal latent unification; vision+symbol coherence. Axis: World models. 4 contributors. Active.
  • riemann-44. Time-correlated reward learning; long-horizon credit assignment. Axis: World models. 5 contributors. Active.

