Agentic engineering.

Agents are not wrappers around language models. They are durable computational entities with persistent state, learned tool affordances, and verifiable execution traces, and they are the right level at which to do most of the engineering work alphabell cares about.

Active cells: 5
2025 published reports: 6
Axis steward: Mira Holloway (term ends 2026-Q3)
Compute commitment: 21% of pool

Position

The agentic engineering axis at alphabell starts from a position that is now mainstream in some places and contested in others: an agent is a computational entity, not a prompt. We treat its memory, identity, and resource budgets as first-class primitives — exposed by the runtime, written into traces, and managed by the same governance machinery as any other artefact in the lab.

What that means concretely: we don't reinvent state by stuffing it into ever-larger context windows. We don't track an agent's identity through naming conventions in a prompt. We build runtime substrates that expose those primitives to the agent and to the operator, and we evaluate agents at the level of their behaviour over weeks and months, not single trajectories.
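As a rough illustration of state and identity living in the runtime rather than in a prompt, the sketch below keeps an agent's identity, memory, and remaining budget in a record that survives across sessions. Everything here is an assumption made for illustration: `AgentRecord`, its fields, and the JSON file format are hypothetical and are not the substrate's actual interface.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any, Dict


@dataclass
class AgentRecord:
    """Hypothetical runtime-side record of one agent (illustrative only)."""

    # Identity is assigned by the runtime, not by a name written into a prompt.
    agent_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Persistent memory lives outside the model's context window.
    memory: Dict[str, Any] = field(default_factory=dict)
    # Remaining budget in abstract units (the unit is an assumption).
    budget_remaining: int = 1000

    def save(self, path: Path) -> None:
        """Persist the record so a later session can resume it."""
        path.write_text(json.dumps(asdict(self), sort_keys=True))

    @classmethod
    def load(cls, path: Path) -> "AgentRecord":
        """Restore a record written by save()."""
        return cls(**json.loads(path.read_text()))
```

A later session would call `AgentRecord.load(path)` and get back the same `agent_id` and memory, without any of it having passed through a context window.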

Research threads

  • Durable agent substrates. The runtime layer that exposes persistent state, tool affordances, identity, and resource budgets. Substrate v1 was released in 2025-Q4 (paper 25/14). Substrate v2 is in design.
  • Long-horizon planning under partial observability. Plans that survive interruptions, environment shifts, and information arriving over weeks rather than seconds. Cells working here intersect with the world-models axis on the question of when a learned environment model is a better planner than a hand-built one.
  • Multi-agent negotiation protocols. Protocols for cases where agents with asymmetric goals and capabilities have to reach joint outcomes. The substrate-mediated negotiation result (25/02) is the canonical reference; the joint cell kalman-04 is now consolidating this work.
  • Sandboxed self-modification. The capability for an agent to propose, sandbox, evaluate, and incorporate changes to its own tool catalogue, evaluation criteria, or learned skills. This thread is closely watched, paired with interpretability cells, and governed by the modification-under-review (MUR) protocol shared with the RSI axis.
  • Execution-trace verifiability. Every substrate-hosted agent emits a structured, content-addressed trace; downstream consumers (operators, debate-based oversight, paired interpretability cells) can reconstruct any decision from this trace. Used in the debate-plus-trace oversight result (25/12).
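One common way to make a trace content-addressed is to hash each entry together with the address of the entry before it, so that any later edit changes every address downstream. The sketch below shows that idea under stated assumptions: the entry layout, the `Trace` class, and the choice of SHA-256 over canonical JSON are all hypothetical, and the substrate's real trace format is not described in this document.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


def entry_address(parent: Optional[str], payload: Dict[str, Any]) -> str:
    """Address = SHA-256 of a canonical JSON encoding of (parent, payload)."""
    canonical = json.dumps(
        {"parent": parent, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Trace:
    """Hypothetical append-only trace in which each entry names its parent."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def append(self, payload: Dict[str, Any]) -> str:
        """Add an entry and return its content address."""
        parent = self.entries[-1]["address"] if self.entries else None
        addr = entry_address(parent, payload)
        self.entries.append({"address": addr, "parent": parent, "payload": payload})
        return addr

    def verify(self) -> bool:
        """Recompute every address; False if any entry was altered or reordered."""
        parent: Optional[str] = None
        for entry in self.entries:
            if entry["parent"] != parent:
                return False
            if entry["address"] != entry_address(parent, entry["payload"]):
                return False
            parent = entry["address"]
        return True
```

A downstream consumer that holds only the final address can call `verify()` on a received trace and then walk the entries to reconstruct a decision, knowing that the payloads have not been changed since they were written.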

Why this is hard

The agentic regime exposes a class of problems that are hard for reasons not specific to agents: they show up wherever you take software seriously over long time horizons. A few examples: (1) verifying that a tool an agent learned to use is doing what the agent thinks it is doing; (2) communicating to a downstream operator which commitments the agent has made that the operator will be obliged to honour; (3) bounding the resource consumption of a long-running agent whose execution path is decided incrementally.

We do not pretend to have solved these. The point of this axis is to build infrastructure where they can be addressed as engineering problems rather than as prompt-engineering folklore.
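To make the third example concrete, here is a minimal sketch of one way to bound consumption when the execution path is not known in advance: each step declares its cost before it runs, and the runtime refuses any step that would exceed a hard cap. The names `Budget`, `BudgetExceeded`, and `run_bounded`, and the assumption that a step can state its cost up front, are illustrative and are not taken from the substrate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


class BudgetExceeded(Exception):
    """Raised when a step would push spending past the cap."""


@dataclass
class Budget:
    limit: int      # hard cap in abstract units (assumed)
    spent: int = 0

    def reserve(self, cost: int) -> None:
        """Charge `cost` up front, or raise without charging anything."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if self.spent + cost > self.limit:
            raise BudgetExceeded(f"{self.spent} + {cost} > {self.limit}")
        self.spent += cost


def run_bounded(
    steps: Iterable[Tuple[int, Callable[[], str]]],
    budget: Budget,
) -> List[str]:
    """Run (cost, action) pairs one at a time until the budget would be exceeded.

    `steps` may be a generator, so the next step can be chosen only after the
    previous one finishes; the cap still holds because every step is charged
    before it executes.
    """
    results: List[str] = []
    for cost, action in steps:
        try:
            budget.reserve(cost)
        except BudgetExceeded:
            break  # stop cleanly; the refused step never runs
        results.append(action())
    return results
```

The design choice here is to reserve before executing rather than to meter afterwards. That gives a hard bound, at the price of requiring a cost estimate for every step, which is itself one of the open problems named above.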

Where this connects

To interpretability. The execution-trace substrate is also what scalable-oversight (debate-plus-trace) and verification work consumes. The interpretability axis and the agentic axis share tooling.

To recursive self-improvement. The sandboxed self-modification capability and the MUR protocol are co-developed with the RSI axis. The agentic axis owns the substrate; the RSI axis owns the experimentation and the safety protocol applied on top of it.
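The propose, sandbox, evaluate, incorporate cycle for a tool catalogue can be sketched in a few lines. This is an assumption-laden illustration and does not model the MUR protocol, whose details are not given here. The "sandbox" is only a copy of the catalogue, not real process isolation, and `try_incorporate` and the check functions are hypothetical names.

```python
from typing import Callable, Dict, List

Tool = Callable[[int], int]
Catalogue = Dict[str, Tool]
Check = Callable[[Catalogue], bool]


def try_incorporate(
    live: Catalogue,
    name: str,
    candidate: Tool,
    checks: List[Check],
) -> Catalogue:
    """Return a new catalogue with `candidate` installed if all checks pass.

    The live catalogue is never mutated: the proposal is applied to a copy,
    evaluated there, and adopted only on success.
    """
    sandbox = dict(live)            # propose: apply the change to a copy
    sandbox[name] = candidate
    try:
        accepted = all(check(sandbox) for check in checks)   # evaluate
    except Exception:
        accepted = False            # a crashing candidate is rejected
    return sandbox if accepted else live                     # incorporate or discard
```

For example, with `live = {"double": lambda x: 2 * x}` and a check that requires `catalogue["double"](3) == 6`, a candidate `lambda x: x + x` is accepted, while `lambda x: 3 * x` is rejected and the original `live` object is returned unchanged.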

To world models. Long-horizon planning depends on having predictive models of the environment. Cells in the agentic axis pull world models from the world-models axis the way most projects pull libraries.

Active cells under this axis

fourier-67 · Durable agent substrates · Substrate v1 → v2 · Agentic eng. · 4 contributors · active
kalman-04 · Negotiation substrate (joint) · Merged from fourier-67/polya-25 · Agentic eng. · 7 contributors · active
euler-99 · Long-horizon planning + POMDP · Substrate-integrated · Agentic eng. · 5 contributors · active
ramanujan-07 · Tokenizer + compositional vocab · Lexically-grounded agents · Agentic eng. · 4 contributors · active
hopf-23 · Multi-agent emergence study · Resources reallocated to kalman-04 · Agentic eng. · 3 contributors · paused

