Karima Belkadi

Long-tenured contributor

Based in Paris

ORCID 0000-6288-5798-4092

Research

Karima is the interpretability axis steward and the principal author of the mechanistic-circuit-analysis tooling that operates on frontier-class models without quadratic enumeration costs. The result that ~700 reusable circuits explain 86% of behaviourally relevant activations on the lab's benchmark suite is hers; it is one of the lab's more cited results of the last two years.

Beyond the headline circuits result, she is the load-bearing author of the operational pairing protocol — the trust model, the artefact pipeline, the disagreement procedure, the escalation channels — that makes interpretability-cell pairing actually function as a check rather than a ceremony.

She convenes the interpretability axis's monthly review pool from Paris. Her axis-stewardship term ends 2026-Q2. She has been with alphabell since 2019 and is widely credited as the person who made the lab's structural commitment to pairing actually operable.

Background

Ph.D. computer science, Sorbonne Université, 2014. M.Sc. École Polytechnique.

Prior to alphabell: École Polytechnique; Halcyon Safety; Cantor Initiative.

Selected publications

May 2025 · ab-mechanistic-ci
Mechanistic Circuit Analysis at Frontier Scale: cells as a unit of interpretability
Jiang Yifei, Nico Almgren, Karima Belkadi, Hester Vandekerckhove
Sep 2024 · ab-interpretabili
Interpretability Cell Pairing: how every dual-use capability run gets a watchful sibling
Karima Belkadi, Hester Vandekerckhove, Yuki Cho
Sep 2025 · ab-scalable-overs
Scalable Oversight for Multi-Step Agent Systems: a Debate-Plus-Trace Approach
Ifeoma Nwosu-Howard, Hiroshi Tanigawa, Maral Lotfi, Ruth Wernicke
Mar 2025 · ab-verifiable-pol
Toward Formal Verification of Learned Policies in Bounded Environments
Aviva Stern, Sun Kyung-min, Felipe Avelar
Nov 2024 · ab-sandboxed-self
Sandboxed Self-Modification: a confinement specification and implementation
Liora Sabatini, Cheung Wai-Lin, Marek Holub

Recent talks

Circuits at frontier scale, ICLR 2025 (oral)
What pairing actually requires, AISI Pairing Workshop 2025
The 700-circuit conjecture, NeurIPS 2024

Working with

Karima is currently part of node-cell hilbert-13, working under the Interpretability & alignment research axis. The cell is open to substantive correspondence from researchers working on adjacent problems; route requests through hilbert-13@alphabell.com or directly to Karima at karima-belkadi@alphabell.com.

Contact

Cross-references