α · News · 2024-08-22
Tooling

Tooling release: alphabell/oversight-tools v0.4

ab-circuits, ab-trace, ab-debate, and ab-pairs now ship as a unified distribution — `alphabell-oversight` on PyPI, with a first cut of an adapter for non-alphabell substrates.

The 0.4 release of alphabell/oversight-tools is now generally available. It packages four tools that have, until now, been distributed separately across our open-source repos: ab-circuits (mechanistic interpretability), ab-trace (content-addressed execution traces), ab-debate (debate-based oversight harness), and ab-pairs (paired-cell operational tooling). Shipping them as a unified distribution is the result of nine months of operational work to make the toolchain installable by external researchers without requiring familiarity with alphabell's internal infrastructure.

What's new in 0.4 specifically: the unified distribution, the cross-tool data exchange spec (every tool now produces and consumes the same content-addressed trace format), the optional ab-shell REPL that wraps the toolchain in an interactive Python environment, and the first cut of a non-alphabell-substrate adapter that lets external researchers point the tools at their own agent frameworks.
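For readers unfamiliar with content addressing, the idea behind a shared content-addressed format can be sketched in a few lines. This is an illustration only, not the toolchain's actual format: the canonical-JSON encoding, the `sha256:` prefix, and the names `trace_address` and `TraceStore` are all assumptions made for the example.

```python
import hashlib
import json


def trace_address(record: dict) -> str:
    """Return a content address: the SHA-256 of a canonical JSON encoding.

    Hypothetical helper; the real trace format may encode records differently.
    """
    # Sorting keys and fixing separators makes the encoding deterministic,
    # so two equal records always produce the same address.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TraceStore:
    """A toy in-memory store keyed by content address (illustrative only)."""

    def __init__(self) -> None:
        self._blobs: dict[str, dict] = {}

    def put(self, record: dict) -> str:
        # The key is derived from the content, so writing the same record
        # twice is idempotent and needs no coordination between tools.
        address = trace_address(record)
        self._blobs[address] = record
        return address

    def get(self, address: str) -> dict:
        record = self._blobs[address]
        # Re-hash on read so a corrupted or swapped record is detected.
        if trace_address(record) != address:
            raise ValueError(f"content does not match address {address}")
        return record
```

Because the address depends only on the content, any tool that produces a record and any tool that consumes it can agree on its identity without sharing anything beyond the encoding rules.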

We have been careful, in designing the external adapter, to not promise more than we can deliver. The adapter handles substrate-style agents — those with persistent state, tool catalogues, and execution traces — and does so well. It does not handle prompt-only agents, and it produces poor results when traces are reconstructed after the fact rather than emitted live. We document both limitations in the adapter's README.

Distribution is via PyPI as alphabell-oversight, container images on ghcr.io/alphabell-labs/oversight, and source at github.com/alphabell-labs/oversight-tools. The toolchain is MIT-licensed except for ab-pairs, which is under the lab's research-conduct-charter-compatible licence (the rationale for this exception is documented in the licence text; it does not restrict use, but it does make the operational commitments around the tool binding).

We have heard from several external research groups that the prior version of the toolchain was close to unusable without alphabell-internal context. The 0.4 release is the one we recommend for external use. We will be running a small set of office-hours sessions over the next quarter for groups adopting the toolchain; details are in the README.

What's coming in 0.5 (target end-2024): a partial-preemption story for long-running agents whose execution may need to be paused under federated compute scheduling, a richer disagreement-handling spec for paired cells, and an improved external substrate adapter informed by what we see in the first quarter of external use.

As always, the toolchain is open to issue reports and pull requests. We do not promise responsiveness on all issues, but we do read everything. Substantive contributions to ab-circuits and ab-trace have come from external researchers in the past, and we expect that to continue.

A note on the migration path. Users of the prior ab-circuits-py, ab-trace-py, and ab-pairs-py packages should not expect a transparent migration: the unified distribution introduces a shared trace format that is not byte-compatible with the prior tools' on-disk representations. We provide a one-shot migration script in the release that converts existing trace stores to the unified format; we have tested the script against the lab's internal trace archive of roughly 11 TB, and it works, but it is single-threaded and slow on stores with many small traces. Plan accordingly. The prior packages remain available on PyPI for one year after this release; we will yank them in August 2025.
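One way to plan is to survey a store before migrating it. The sketch below is a rough aid under stated assumptions, not part of the release: it treats every regular file under a directory as one trace (the prior tools' actual on-disk layout may differ), and the 4096-byte threshold for "small" is an arbitrary choice. `survey_store` is a hypothetical name.

```python
import os


def survey_store(root: str, small_bytes: int = 4096) -> dict:
    """Count files under root and how many are smaller than small_bytes.

    Assumption: one file is one trace. Adjust for the real store layout.
    """
    total_files = 0
    total_bytes = 0
    small_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            size = os.path.getsize(path)
            total_files += 1
            total_bytes += size
            if size < small_bytes:
                small_files += 1
    small_fraction = small_files / total_files if total_files else 0.0
    return {
        "total_files": total_files,
        "total_bytes": total_bytes,
        "small_files": small_files,
        "small_fraction": small_fraction,
    }
```

A high `small_fraction` combined with a large `total_files` suggests the store is of the kind on which the single-threaded script is slow, so it may be worth scheduling the migration well ahead of the August 2025 cutoff.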

We also want to acknowledge the external contributors whose pull requests went into this release. ab-circuits 1.2 includes work from seventeen distinct external contributors over the year leading up to the unification; ab-trace v3 includes work from nine. Several of those contributors have, separately, expressed interest in being paired with cells; that is an ongoing conversation that the contributor-cohort steward is shepherding.

This release note is signed by babbage-14, the cell that maintains the toolchain. The specific contributors named in CHANGELOG.md should not be read as the only people who worked on the release; the cell as a whole carries the work.


For the protocol details behind anything mentioned above, see /governance and /charter. For the structural commitments, /about.