
Bounded Self-Modification: Provable Limits on Agent Self-Editing

Liora Sabatini, Marek Holub, Eitan Berkovich

Axis: Recursive self-improvement
Cell: godel-02
Published: Sep 2025
Venue: alphabell index 25/22 · delayed release
Tags: RSI

Abstract

We formalise the class of self-modifications that an agent may propose to its own tool catalogue, evaluation criteria, or training procedure, and prove a bound on the rate at which such modifications can compound capability without crossing a pre-registered measurement threshold. The result is constructive: we exhibit a confinement profile under which the bound is tight, and discuss what conditions on the substrate and the modification-under-review protocol make the bound load-bearing. The work is paired with the lab's MUR protocol (25/05).

Index metadata

Cell: godel-02
Compute: redacted
Status: Delayed release (90-day delay); classified appendices not released
Companion: MUR protocol 25/05; interpretability report ab-int-041
DOI: 10.48550/arXiv.2509.10211
arXiv: arXiv:2509.10211

What this paper is part of

This index entry belongs to the Recursive self-improvement research axis. The producing cell, godel-02, collaborates with adjacent cells listed in the cell directory. Where a paired interpretability cell exists, it is identified in the metadata above, and any disagreement reports it files accompany the public release.

How to read this

If you want to use the result: the code and dataset locations are both still listed as TBD, and a dataset will be linked only if one is released. To cite this report, prefer the DOI or arXiv identifier given in the Citation section at the end of this entry. To discuss the report with the producing cell, contact the lab and quote the index entry slug bounded-self-modification-limits.

Limitations

Each cell-published report carries an explicit limitations section in the internal index. We do not paraphrase it here. Read the linked PDF, particularly its limitations and threats-to-validity sections, before any downstream use.

Citation

Liora Sabatini, Marek Holub, Eitan Berkovich. Bounded Self-Modification: Provable Limits on Agent Self-Editing. alphabell index 25/22 · delayed release, Sep 2025. arXiv:2509.10211. doi:10.48550/arXiv.2509.10211.
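
For reference managers, the citation above can also be written as a BibTeX entry. This is a sketch assembled only from the fields shown on this page; the entry key and the choice of the @misc entry type are assumptions, not something the index specifies.

```bibtex
% Entry key and @misc type are illustrative assumptions;
% every field value is copied from this index entry.
@misc{sabatini2025bounded,
  author        = {Sabatini, Liora and Holub, Marek and Berkovich, Eitan},
  title         = {Bounded Self-Modification: Provable Limits on Agent Self-Editing},
  year          = {2025},
  month         = sep,
  eprint        = {2509.10211},
  archivePrefix = {arXiv},
  doi           = {10.48550/arXiv.2509.10211},
  note          = {alphabell index 25/22, delayed release}
}
```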