Research
Yuki works on the disagreement protocol — what happens when a paired interpretability cell and a producing cell read the same training logs and reach different conclusions about whether a run should continue. Her axis-internal note on disagreement handling has become required reading for newly-paired cells.
She has been with alphabell since 2018, joining from a Stanford postdoc in the HAI group. She is one of the two interpretability-cell leads who called the halt on the 25-19 RSI run; the delayed-release report for that halt is co-authored with Liora Sabatini.
Yuki splits her time between Tokyo and the Hong Kong anchor; the East Asia time-zone overlap with the lab's other Hong Kong and Singapore contributors is part of what makes the paired-cell rhythm work for her cell.
Background
Ph.D. computer science, University of Tokyo, 2015. Postdoc at Stanford (HAI), 2015-2017.
Prior to alphabell: University of Tokyo; Constellation; Helios Safety Group.
Selected publications
- Sep 2024 · ab-interpretabili · Interpretability Cell Pairing: how every dual-use capability run gets a watchful sibling. Karima Belkadi, Hester Vandekerckhove, Yuki Cho
- Jun 2025 · ab-recursive-modi · Modification-Under-Review: protocols for safe self-modification of training procedures. Liora Sabatini, Yuki Cho, Aravind Periyasamy
- May 2025 · ab-mechanistic-ci · Mechanistic Circuit Analysis at Frontier Scale: cells as a unit of interpretability. Jiang Yifei, Nico Almgren, Karima Belkadi, Hester Vandekerckhove
- Sep 2025 · ab-scalable-overs · Scalable Oversight for Multi-Step Agent Systems: a Debate-Plus-Trace Approach. Ifeoma Nwosu-Howard, Hiroshi Tanigawa, Maral Lotfi, Ruth Wernicke
- Nov 2024 · ab-sandboxed-self · Sandboxed Self-Modification: a confinement specification and implementation. Liora Sabatini, Cheung Wai-Lin, Marek Holub
Recent talks
- How an interpretability cell calls a halt, AISI Pairing Workshop 2025
- Reading traces under disagreement, NeurIPS workshop 2024
Yuki is currently part of node-cell turing-11, working under the Interpretability & alignment research axis. The cell is open to substantive correspondence from researchers working on adjacent problems; route requests through turing-11@alphabell.com or directly to Yuki at yuki-cho@alphabell.com.
Contact
- EMAIL: yuki-cho@alphabell.com
- ORCID: 0000-7355-3066-9127
- X: @yukicho
- BLUESKY: yuki-cho.bsky.social
- GITHUB: @yukicho
Cross-references