A Federated Compute Scheduler for an Asynchronous Research Lab
Pranav Iyer, Yusra Habibi, Akoss Vidor
@techreport{iyer2024federated,
title = {A Federated Compute Scheduler for an Asynchronous Research Lab},
author = {Iyer, Pranav and Habibi, Yusra and Vidor, Akoss},
year = {2024},
number = {Internal release — alphabell index 24/17},
institution = {alphabell},
month = oct,
doi = {10.48550/arXiv.2410.84851},
url = {https://dev.alphabell.com/publications/federated-compute-scheduler}
}
Abstract
We describe the scheduler underlying alphabell's federated compute pool. Cells commit GPU and TPU capacity; access is allocated by a hybrid mechanism combining tenure-weighted priority, project signals, and quadratic voting among active contributors. We discuss two failure modes: collusion in QV rounds, and capacity hoarding by cells with long-running RSI training runs. Mitigations are documented in the open implementation.
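The abstract names three inputs to allocation (tenure-weighted priority, project signals, and quadratic voting) without giving formulas. The sketch below illustrates how such a hybrid could be combined; it is not the ab-scheduler implementation. The only standard piece is the quadratic voting rule that casting v votes costs v² credits. The `Request` fields, the weights, the caps, and the cell names are all hypothetical and chosen only to make the example run.

```python
from dataclasses import dataclass
from math import isqrt


def qv_cost(votes: int) -> int:
    """Credits needed to cast `votes` votes on one proposal (standard QV rule)."""
    return votes * votes


def max_votes(credits: int) -> int:
    """Largest whole number of votes affordable with `credits`."""
    return isqrt(credits)


@dataclass(frozen=True)
class Request:
    cell: str              # requesting cell (hypothetical identifier)
    tenure_months: int     # assumed tenure measure
    project_signal: float  # assumed project signal, in [0, 1]
    qv_votes: int          # votes received in a QV round


def hybrid_score(req: Request, w_tenure: float = 0.3, w_signal: float = 0.3,
                 w_qv: float = 0.4, tenure_cap: int = 24,
                 vote_cap: int = 10) -> float:
    """Weighted sum of the three inputs; weights and caps are assumptions."""
    # Cap and normalise tenure and votes to [0, 1] so no single input dominates.
    tenure = min(req.tenure_months, tenure_cap) / tenure_cap
    votes = max(0, min(req.qv_votes, vote_cap)) / vote_cap
    return w_tenure * tenure + w_signal * req.project_signal + w_qv * votes


def rank(requests: list[Request]) -> list[Request]:
    """Order requests from highest to lowest hybrid score."""
    return sorted(requests, key=hybrid_score, reverse=True)


if __name__ == "__main__":
    pending = [
        Request("cell-a", tenure_months=24, project_signal=0.5, qv_votes=5),
        Request("cell-b", tenure_months=6, project_signal=0.9, qv_votes=10),
    ]
    for req in rank(pending):
        print(req.cell, round(hybrid_score(req), 3))
```

Because vote cost grows quadratically, a contributor with 100 credits can cast at most 10 votes on a single proposal, which limits how far one voter can push a request. It does not prevent the coordinated voting the abstract lists as a failure mode.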
Index metadata
- Cell: polya-25
- Compute: 8 H100-days (analysis only)
- Status: Open release
- Code: github.com/alphabell-labs/ab-scheduler
- DOI: 10.48550/arXiv.2410.84851
- arXiv: arXiv:2410.84851
What this paper is part of
This index entry belongs to the Agentic engineering research axis. The producing cell, polya-25, collaborates with adjacent cells listed in the cell directory. Where a paired interpretability cell exists, it is identified in the metadata above, and any disagreement reports it files accompany the public release.
How to read this
If you want to use the result: the code (where available) is at https://github.com/alphabell-labs/ab-federate; the dataset location is still TBD and will be listed once a dataset is released. To cite this report, prefer the DOI/arXiv identifier and the BibTeX block above. To discuss it with the producing cell, contact the lab and quote the index entry slug federated-compute-scheduler.
Limitations
Each cell-published report carries an explicit limitations section in the internal index. We do not paraphrase it here. Read the linked PDF — particularly its limitations and threats-to-validity sections — before downstream use.
Pranav Iyer, Yusra Habibi, Akoss Vidor. A Federated Compute Scheduler for an Asynchronous Research Lab. Internal release — alphabell index 24/17, Oct 2024. arXiv:2410.84851. doi:10.48550/arXiv.2410.84851.