Toward Formal Verification of Learned Policies in Bounded Environments
Aviva Stern, Sun Kyung-min, Felipe Avelar
@techreport{stern2025verifiable,
title = {Toward Formal Verification of Learned Policies in Bounded Environments},
author = {Stern, Aviva and Kyung-min, Sun and Avelar, Felipe},
year = {2025},
number = {Internal release — alphabell index 25/01},
institution = {alphabell},
month = {mar},
doi = {10.48550/arXiv.2503.73959},
url = {https://dev.alphabell.com/publications/verifiable-policies-formal-verification}
}
Abstract
We present a verification framework for learned policies operating in bounded environments, combining abstract interpretation of the policy network with symbolic execution of the environment dynamics. We prove safety properties of three deployed cells' policies for a class of warehouse-coordination tasks. The framework's applicability is bounded: it requires environments with finite, formally specifiable state. We argue the right ambition is to expand the class of such environments rather than to weaken the proof.
Index metadata
- Cell
- lebesgue-22
- Compute
- 12 H100-days
- Status
- Open release
- DOI
- 10.48550/arXiv.2503.73959
- arXiv
- arXiv:2503.73959
What this paper is part of
This index entry is part of the Interpretability & alignment research axis. The producing cell — lebesgue-22 — collaborates with adjacent cells listed in the cell directory. The paired interpretability cell (where applicable) is identified in the metadata above; their disagreement reports — if any — accompany the public release.
How to read this
If you want to use the result: the code (where available) is at https://github.com/alphabell-labs/ab-verifiab; the dataset is at TBD when one is released. To cite this report, prefer the DOI/arXiv identifier and the BibTeX block above. To discuss this with the producing cell, contact the lab with the index entry slug verifiable-policies-formal-verification.
Limitations
Each cell-published report carries an explicit limitations section in the internal index. We do not paraphrase it here. Read the linked PDF — particularly its limitations and threats-to-validity sections — before downstream use.
Aviva Stern, Sun Kyung-min, Felipe Avelar. Toward Formal Verification of Learned Policies in Bounded Environments. Internal release — alphabell index 25/01, Mar 2025. arXiv:2503.73959. doi:10.48550/arXiv.2503.73959.