Yuki Cho

Long-tenured contributor

Based in Tokyo

Node-cell turing-11

ORCID 0000-7355-3066-9127

Research

Yuki works on the disagreement protocol — what happens when a paired interpretability cell and a producing cell read the same training logs and reach different conclusions about whether a run should continue. Her axis-internal note on disagreement handling has become required reading for newly-paired cells.

She has been with alphabell since 2018, joining from a Stanford postdoc in the HAI group. She is one of the two interpretability-cell leads who called the halt on the 25-19 RSI run; the delayed-release report for that halt is co-authored with Liora Sabatini.

Yuki splits her time between Tokyo and the Hong Kong anchor; the East Asia time-zone overlap with the lab's other Hong Kong and Singapore contributors is part of what makes the paired-cell rhythm work for her cell.

Background

Ph.D. in computer science, University of Tokyo, 2015. Postdoc at Stanford (HAI), 2015-2017.

Prior to alphabell: University of Tokyo; Constellation; Helios Safety Group.

Selected publications

Sep 2024 · ab-interpretabili
Interpretability Cell Pairing: how every dual-use capability run gets a watchful sibling
Karima Belkadi, Hester Vandekerckhove, Yuki Cho

Jun 2025 · ab-recursive-modi
Modification-Under-Review: protocols for safe self-modification of training procedures
Liora Sabatini, Yuki Cho, Aravind Periyasamy

May 2025 · ab-mechanistic-ci
Mechanistic Circuit Analysis at Frontier Scale: cells as a unit of interpretability
Jiang Yifei, Nico Almgren, Karima Belkadi, Hester Vandekerckhove

Sep 2025 · ab-scalable-overs
Scalable Oversight for Multi-Step Agent Systems: a Debate-Plus-Trace Approach
Ifeoma Nwosu-Howard, Hiroshi Tanigawa, Maral Lotfi, Ruth Wernicke

Nov 2024 · ab-sandboxed-self
Sandboxed Self-Modification: a confinement specification and implementation
Liora Sabatini, Cheung Wai-Lin, Marek Holub

Recent talks

How an interpretability cell calls a halt, AISI Pairing Workshop 2025
Reading traces under disagreement, NeurIPS workshop 2024

Working with

Yuki is currently part of node-cell turing-11, working under the Interpretability & alignment research axis. The cell is open to substantive correspondence from researchers working on adjacent problems; route requests through turing-11@alphabell.com or directly to Yuki at yuki-cho@alphabell.com.
