α · Publications · verifiable-policies-formal-verification

Toward Formal Verification of Learned Policies in Bounded Environments

Aviva Stern, Sun Kyung-min, Felipe Avelar

Axis Interpretability & alignment
Cell lebesgue-22
Published Mar 2025
Venue Internal release — alphabell index 25/01
Tags interp.

Abstract

We present a verification framework for learned policies operating in bounded environments, combining abstract interpretation of the policy network with symbolic execution of the environment dynamics. We prove safety properties of three deployed cells' policies for a class of warehouse-coordination tasks. The framework's applicability is bounded: it requires environments with finite, formally specifiable state. We argue the right ambition is to expand the class of such environments rather than to weaken the proof.

Index metadata

Cell
lebesgue-22
Compute
12 H100-days
Status
Open release
DOI
10.48550/arXiv.2503.73959
arXiv
arXiv:2503.73959

What this paper is part of

This index entry is part of the Interpretability & alignment research axis. The producing cell — lebesgue-22 — collaborates with adjacent cells listed in the cell directory. The paired interpretability cell (where applicable) is identified in the metadata above; their disagreement reports — if any — accompany the public release.

How to read this

If you want to use the result: the code (where available) is at https://github.com/alphabell-labs/ab-verifiab; the dataset is at TBD when one is released. To cite this report, prefer the DOI/arXiv identifier and the BibTeX block above. To discuss this with the producing cell, contact the lab with the index entry slug verifiable-policies-formal-verification.

Limitations

Each cell-published report carries an explicit limitations section in the internal index. We do not paraphrase it here. Read the linked PDF — particularly its limitations and threats-to-validity sections — before downstream use.

Citation

Aviva Stern, Sun Kyung-min, Felipe Avelar. Toward Formal Verification of Learned Policies in Bounded Environments. Internal release — alphabell index 25/01, Mar 2025. arXiv:2503.73959. doi:10.48550/arXiv.2503.73959.