Research
Yifei is a co-author of the mechanistic-circuits-at-frontier-scale result and one of the contributors who developed ab-circuits, the open-source library that the result is bundled with. His specific contribution centres on the circuit-composition machinery — the algorithmic and empirical work that establishes which circuits compose and which do not.
He has been with alphabell since 2019 and works closely with Karima Belkadi on the axis's overall research agenda. He is one of the most active reviewers in the cross-axis methodology review pool and has a habit of asking, on first read, 'what is the smallest circuit that produces this behaviour?'
Yifei is based in Shenzhen and works closely with two universities in the region. He maintains a Chinese-language tutorial sequence on mechanistic interpretability that several Chinese-speaking university courses now use.
Background
Ph.D. in computer science, Tsinghua University, 2014. M.Sc., Peking University.
Prior to alphabell: Tsinghua; Halcyon Safety; Praxis AI Studies.
Selected publications
- May 2025 · ab-mechanistic-ci · Mechanistic Circuit Analysis at Frontier Scale: cells as a unit of interpretability · Jiang Yifei, Nico Almgren, Karima Belkadi, Hester Vandekerckhove
- Sep 2024 · ab-interpretabili · Interpretability Cell Pairing: how every dual-use capability run gets a watchful sibling · Karima Belkadi, Hester Vandekerckhove, Yuki Cho
- Sep 2025 · ab-scalable-overs · Scalable Oversight for Multi-Step Agent Systems: a Debate-Plus-Trace Approach · Ifeoma Nwosu-Howard, Hiroshi Tanigawa, Maral Lotfi, Ruth Wernicke
- Mar 2025 · ab-verifiable-pol · Toward Formal Verification of Learned Policies in Bounded Environments · Aviva Stern, Sun Kyung-min, Felipe Avelar
- Nov 2024 · ab-sandboxed-self · Sandboxed Self-Modification: a confinement specification and implementation · Liora Sabatini, Cheung Wai-Lin, Marek Holub
Recent talks
- Circuits as a unit of interpretability, ICLR 2025
- What composes and what doesn't, NeurIPS 2024
Jiang is currently part of node-cell hilbert-13, working under the Interpretability & alignment research axis. The cell is open to substantive correspondence from researchers working on adjacent problems; route requests through hilbert-13@alphabell.com or directly to Jiang at jiang-yifei@alphabell.com.
Contact
- EMAIL: jiang-yifei@alphabell.com
- ORCID: 0000-8809-0765-9377
- X: @jiangyifei
- BLUESKY: jiang-yifei.bsky.social
- GITHUB: @jiangyifei
Cross-references