Genos: a human-centric genomic foundation model.
Adi Lin, Bin Xie, Cheng Ye, Cheng Wang, Duoyuan Chen, Ercheng Wang, Fanfeng Lu, Guirong Xue, Haiqiang Zhang, Jiajie Zhan, Jianfeng Zhang, Jiangshuan Pang, Jianqiang Liang, Jiawei Lin, Jiaxin Ma
Abstract
BACKGROUND: The rapid expansion of human genomic data demands foundation models that can handle ultra-long sequences and capture population diversity. Existing models commonly fall short here, lacking human-specific representation and clinical inference efficiency. RESULTS: Here, we introduce Genos (Genos-1.2B/Genos-10B), a human-centric genomic foundation model engineered for million-basepair sequence modeling. Genos uses a large-scale mixture-of-experts architecture optimized for a 1-Mb context and is trained on high-quality human de novo assemblies from datasets such as the Human Pangenome Reference Consortium and the Human Genome Structural Variation Consortium, which represent diverse global populations. A suite of optimization strategies was implemented to ensure training stability and enhance computational efficiency; together, these strategies reduce costs and facilitate million-basepair context modeling. Functionally, Genos performs single-nucleotide resolution analysis and dynamically simulates the cascade effects of noncoding variations on RNA expression profiles. In comprehensive evaluations, Genos uniformly surpasses state-of-the-art models on critical human genomics benchmarks and demonstrates robust omics-text cross-modal diagnostic capabilities. We present a systematic technical evaluation and validation of Genos's architecture, training convergence, and performance across standard benchmarks. CONCLUSIONS: This work provides a reliable technical blueprint and performance benchmark for the development of the next generation of high-efficiency genomic foundation models. Genos model weights, inference code, and usage documentation are publicly available on GitHub (https://github.com/BGI-HangzhouAI/Genos) and Hugging Face Hub (https://huggingface.co/BGI-HangzhouAI).
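For readers unfamiliar with the mixture-of-experts idea mentioned in the abstract, the following is a minimal, generic sketch of top-k expert routing in plain Python. It is not the Genos implementation: the abstract does not state the number of experts, the value of k, or the gating function, so the four toy experts, `k=2`, the softmax gate, and the names `top_k_route` and `moe_forward` are all assumptions made purely for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their gate weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs for one input."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Hypothetical toy experts: scalar functions standing in for feed-forward blocks.
experts = [lambda x: x, lambda x: 2 * x, lambda x: x * x, lambda x: -x]

# Hypothetical router scores for one token; only experts 1 and 2 are evaluated.
logits = [0.1, 2.0, 1.0, -1.0]
routes = top_k_route(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
```

Because only k of the experts run for each token, the compute per token grows with k rather than with the total number of experts, which is the general reason sparse mixture-of-experts layers are used to scale model capacity.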