
Agentic engineering.

Agents are not wrappers around language models. They are durable computational entities with persistent state, learned tool affordances, and verifiable execution traces, and they are the right level at which to do most of the engineering work alphabell cares about.

Active cells 5

2025 published reports 6

Axis steward Mira Holloway (term ends 2026-Q3)

Compute commitment 21% of pool

Position

The agentic engineering axis at alphabell starts from a position that is now mainstream in some places and contested in others: an agent is a computational entity, not a prompt. We treat its memory, identity, and resource budgets as first-class primitives — exposed by the runtime, written into traces, and managed by the same governance machinery as any other artefact in the lab.

What that means concretely: we don't reinvent state by stuffing it into ever-larger context windows. We don't track an agent's identity through naming conventions in a prompt. We build runtime substrates that expose those primitives to the agent and to the operator, and we evaluate agents at the level of their behaviour over weeks and months, not single trajectories.
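
As an illustration of what "first-class primitives" could look like, here is a minimal sketch in which the runtime, not the prompt, owns an agent's identity, budget, and memory. Every name in it (`ToySubstrate`, `AgentIdentity`, `ResourceBudget`, the budget fields) is hypothetical and chosen for this example; it is not the Substrate v1 API.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class AgentIdentity:
    """Stable identifier issued by the runtime rather than named in a prompt."""
    agent_id: str


@dataclass
class ResourceBudget:
    """Hypothetical budget dimensions; a real substrate may track others."""
    max_tokens: int
    max_tool_calls: int


@dataclass
class AgentRecord:
    identity: AgentIdentity
    budget: ResourceBudget
    memory: dict = field(default_factory=dict)


class ToySubstrate:
    """In-memory stand-in for a runtime that owns agent state."""

    def __init__(self):
        self._records = {}

    def register(self, budget):
        identity = AgentIdentity(agent_id=uuid.uuid4().hex)
        self._records[identity.agent_id] = AgentRecord(identity, budget)
        return identity

    def remember(self, identity, key, value):
        self._records[identity.agent_id].memory[key] = value

    def recall(self, identity, key, default=None):
        return self._records[identity.agent_id].memory.get(key, default)

    def snapshot(self, identity):
        """Serialise one agent's record so it can outlive the process."""
        return json.dumps(asdict(self._records[identity.agent_id]), sort_keys=True)

    def restore(self, blob):
        data = json.loads(blob)
        identity = AgentIdentity(**data["identity"])
        budget = ResourceBudget(**data["budget"])
        self._records[identity.agent_id] = AgentRecord(identity, budget, data["memory"])
        return identity


# Usage: state written in one runtime instance is recovered in another.
runtime = ToySubstrate()
agent = runtime.register(ResourceBudget(max_tokens=10_000, max_tool_calls=50))
runtime.remember(agent, "open_commitment", "reply to operator by Friday")
blob = runtime.snapshot(agent)

restarted = ToySubstrate()
same_agent = restarted.restore(blob)
```

The point of the sketch is only that memory and identity survive a restart without being replayed through a context window; persistence format, access control, and governance hooks are left out.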

Research threads

Durable agent substrates. The runtime layer that exposes persistent state, tool affordances, identity, and resource budgets. Substrate v1 was released in 2025-Q4 (paper 25/14). v2 is in design.
Long-horizon planning under partial observability. Plans that survive interruptions, environment shifts, and information arriving over weeks rather than seconds. Cells working here intersect with the world-models axis on the question of when a learned environment model is a better planner than a hand-built one.
Multi-agent negotiation protocols. How agents with asymmetric goals and capabilities reach joint outcomes. The substrate-mediated negotiation result (25/04) is the canonical reference; the joint cell kalman-04 is now consolidating this work.
Sandboxed self-modification. The capability for an agent to propose, sandbox, evaluate, and incorporate changes to its own tool catalogue, evaluation criteria, or learned skills. This thread is closely watched, paired with interpretability cells, and governed by the modification-under-review (MUR) protocol shared with the RSI axis.
Execution-trace verifiability. Every substrate-hosted agent emits a structured, content-addressed trace; downstream consumers (operators, debate-based oversight, paired interpretability cells) can reconstruct any decision from this trace. Used in the debate-plus-trace oversight result (25/12).
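
To make "structured, content-addressed trace" concrete, the sketch below stores each trace entry under the SHA-256 hash of its canonical JSON form and links it to the previous entry's address. The hash chaining, the entry fields (`kind`, `body`, `parent`), and the class name are assumptions made for illustration; the source does not specify the trace format.

```python
import hashlib
import json


def content_address(payload):
    """Address an entry by the SHA-256 of its canonical JSON encoding."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ExecutionTrace:
    """Append-only list of (address, payload) pairs, chained by parent address."""

    def __init__(self):
        self.entries = []

    def append(self, kind, body):
        parent = self.entries[-1][0] if self.entries else None
        payload = {"kind": kind, "body": body, "parent": parent}
        address = content_address(payload)
        self.entries.append((address, payload))
        return address

    def verify(self):
        """Recompute every address and check that the parent links are intact."""
        parent = None
        for address, payload in self.entries:
            if payload["parent"] != parent:
                return False
            if content_address(payload) != address:
                return False
            parent = address
        return True


# Usage: an untouched trace verifies; editing any recorded entry breaks it.
trace = ExecutionTrace()
trace.append("observation", {"text": "ticket 17 is unassigned"})
trace.append("decision", {"tool": "assign_ticket", "args": {"ticket": 17}})
intact_before = trace.verify()

trace.entries[0][1]["body"]["text"] = "ticket 17 is closed"  # simulated tampering
intact_after = trace.verify()
```

A downstream consumer that holds only the final address can detect any later edit to earlier entries, which is the property oversight and interpretability tooling would rely on when reconstructing a decision.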

Why this is hard

The agentic regime exposes a class of problems that are hard for reasons that are not specific to agents: they show up wherever you take software seriously over long time horizons. A few examples: (1) verifying that a tool an agent learned to use is doing what the agent thinks it is doing; (2) communicating to a downstream operator which commitments an agent has made that the operator will be obliged to honour; (3) bounding the resource consumption of a long-running agent whose execution path is decided incrementally.

We do not pretend to have solved these. The point of this axis is to build infrastructure where they can be addressed as engineering problems rather than as prompt-engineering folklore.
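
The third problem above, bounding a run whose path is decided step by step, can at least be stated in code. The sketch below shows the simplest possible treatment: every step must reserve its cost before it runs, and the run halts when a reservation is refused. The names `StepBudget`, `BudgetExceeded`, and `run_agent`, and the assumption that each step's cost is known in advance, are illustrative only; real costs are usually known only approximately, which is part of why the problem stays open.

```python
class BudgetExceeded(Exception):
    """Raised when a step asks for more than the remaining budget."""


class StepBudget:
    def __init__(self, limit):
        self.limit = limit
        self.spent = 0

    @property
    def remaining(self):
        return self.limit - self.spent

    def reserve(self, cost):
        """Charge the cost up front, or refuse without charging anything."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if self.spent + cost > self.limit:
            raise BudgetExceeded(f"need {cost}, only {self.remaining} left")
        self.spent += cost


def run_agent(choose_next_step, budget):
    """Run until the policy stops or the budget refuses a step.

    choose_next_step(history) returns (step_name, cost) or None to finish.
    """
    history = []
    while True:
        step = choose_next_step(history)
        if step is None:
            return history, "finished"
        name, cost = step
        try:
            budget.reserve(cost)
        except BudgetExceeded:
            return history, "halted_on_budget"
        history.append(name)


# Usage: a policy that never stops on its own is cut off by the budget.
budget = StepBudget(limit=10)
history, status = run_agent(lambda past: ("search", 3), budget)
```

Reserving before executing means the bound holds even though the total number of steps is not known when the run starts.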

Where this connects

To interpretability. The execution-trace substrate is also what scalable-oversight (debate-plus-trace) and verification work consumes. The interpretability axis and the agentic axis share tooling.

To recursive self-improvement. The sandboxed self-modification capability and the MUR protocol are co-developed with the RSI axis. The agentic axis owns the substrate; the RSI axis owns the experimentation and the safety protocol applied on top of it.
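
The propose, sandbox, evaluate, incorporate cycle can be sketched for the simplest case, a change to one entry in a tool catalogue. This is an assumption-laden toy: the function names, the check and approval callbacks, and the use of a dictionary copy as the "sandbox" are all hypothetical, and nothing here represents the MUR protocol itself. Real confinement would need process- or container-level isolation rather than a copied dictionary.

```python
def evaluate(candidate, checks):
    """Run each check against the candidate catalogue; an exception counts as a failure."""
    results = {}
    for check in checks:
        try:
            results[check.__name__] = bool(check(candidate))
        except Exception:
            results[check.__name__] = False
    return results


def propose_change(live, name, new_tool, checks, approve):
    """Apply a change to a copy, evaluate it, and adopt it only if approved."""
    candidate = dict(live)          # the live catalogue is never mutated
    candidate[name] = new_tool
    results = evaluate(candidate, checks)
    accepted = all(results.values()) and bool(approve(name, results))
    return (candidate if accepted else live), results, accepted


def slugify_v1(text):
    return text.lower().replace(" ", "-")


def slugify_v2(text):
    return "-".join(text.lower().split())


def handles_repeated_spaces(catalogue):
    return catalogue["slugify"]("a  b") == "a-b"


def auto_approve(name, results):
    """Stand-in for whatever review step a real protocol would require."""
    return True


live = {"slugify": slugify_v1}
updated, report, ok = propose_change(
    live, "slugify", slugify_v2, [handles_repeated_spaces], auto_approve
)
unchanged, report2, ok2 = propose_change(
    live, "slugify", lambda text: text, [handles_repeated_spaces], auto_approve
)
```

In the first call the proposed tool passes its check and a new catalogue is returned; in the second the check fails, so the caller gets the original catalogue back along with the failing report.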

To world models. Long-horizon planning depends on having predictive models of the environment. Cells in the agentic axis pull world models from the world-models axis the way most projects pull libraries.
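
One way to read "pull world models the way most projects pull libraries" is that a planner depends only on a small prediction interface, so a learned model and a hand-built one are interchangeable. The sketch below assumes such an interface; `WorldModel`, `LinearDrift`, and the greedy `plan` function are hypothetical names, and the one-dimensional state is chosen purely to keep the example short.

```python
from typing import Protocol, Sequence


class WorldModel(Protocol):
    """Anything that can predict the next state from a state and an action."""

    def predict(self, state: float, action: float) -> float: ...


class LinearDrift:
    """Hand-built model: the state moves by the action plus a constant drift."""

    def __init__(self, drift: float):
        self.drift = drift

    def predict(self, state: float, action: float) -> float:
        return state + action + self.drift


def plan(model: WorldModel, state: float, goal: float,
         actions: Sequence[float], horizon: int):
    """Greedy lookahead: at each step pick the action predicted to land closest to the goal."""
    chosen = []
    for _ in range(horizon):
        best = min(actions, key=lambda a: abs(model.predict(state, a) - goal))
        chosen.append(best)
        state = model.predict(state, best)
    return chosen, state


# Usage: the planner never inspects the model beyond calling predict().
chosen, final_state = plan(
    LinearDrift(drift=0.5), state=0.0, goal=3.0, actions=(-1.0, 0.0, 1.0), horizon=2
)
```

Swapping `LinearDrift` for a learned model with the same `predict` method would leave `plan` unchanged, which is the library-like relationship the paragraph describes.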

Active cells under this axis

fourier-67 · Durable agent substrates · Substrate v1 → v2 · Agentic eng. · 4 contributors · active

kalman-04 · Negotiation substrate (joint) · Merged from fourier-67/polya-25 · Agentic eng. · 7 contributors · active

euler-99 · Long-horizon planning + POMDP · Substrate-integrated · Agentic eng. · 5 contributors · active

ramanujan-07 · Tokenizer + compositional vocab · Lexically-grounded agents · Agentic eng. · 4 contributors · active

hopf-23 · Multi-agent emergence study · Resources reallocated to kalman-04 · Agentic eng. · 3 contributors · paused


Publications under this axis

2512.00417
preprint

Counterfactual Trajectory Replay for Off-Policy Agent Debugging

Mira Holloway, Priya Anand, Dineth Karunaratne, Akoss Vidor

alphabell index 25/19 · arXiv 2512.00417

Dec 2025

agentic

2510.16245
internal

Durable Agent Substrate v1: persistent state, learned tool affordances, and verifiable execution traces

Mira Holloway, Dineth Karunaratne, Priya Anand, Cheung Wai-Lin

Internal release — alphabell index 25/14

Oct 2025

agentic

2507.10402
conf

Tokenizer Bias in Agentic Decision-Making

Iben Lykke, Mira Holloway, Cheung Wai-Lin

ACL 2025 · alphabell index 25/13

Jul 2025

agentic

2506.04471
conf

Long-Horizon Plan Repair Under Adversarial Environment Shift

Catriona MacLeod, Sho Tachibana, Renata Coello

ICAPS 2025 · alphabell index 25/11

Jun 2025

agentic

2505.06013
conf

Coalition Stability in Substrate-Mediated Negotiation

Roman Iliescu, Yvonne Akande, Lakshmi Ravi, Wenona Tate

AAMAS 2025 · alphabell index 25/04

May 2025

agentic

2504.19425
internal

Negotiation Protocols Among Heterogeneous Agents: a benchmark and three baselines

Roman Iliescu, Yvonne Akande, Lakshmi Ravi, Pascal Niedermeier

Internal release — alphabell index 25/02

Apr 2025

agentic

2411.13633
internal

Sandboxed Self-Modification: a confinement specification and implementation

Liora Sabatini, Cheung Wai-Lin, Marek Holub

Internal release — alphabell index 24/19 · delayed release

Nov 2024

agentic · RSI

2410.84851
internal

A Federated Compute Scheduler for an Asynchronous Research Lab

Pranav Iyer, Yusra Habibi, Akoss Vidor

Internal release — alphabell index 24/17

Oct 2024

agentic
