Stabilized marker gene identification and functional annotation from single-cell transcriptomic data.
Sandesh Acharya, Pathum Kossinna, Qingrun Zhang, Jiami Guo
Abstract
With the rapid emergence of single-cell transcriptomics datasets, reproducible marker genes and functional annotation of cell types or states are becoming increasingly important. Conventional methods that rely on differential gene expression (DEG) analysis lack both consistency across datasets and functional annotation of the selected markers. Here, we present scSCOPE, an R-based platform that utilizes stabilized LASSO (Least Absolute Shrinkage and Selection Operator) feature selection, bootstrapped co-expression networks, and pathway enrichment analysis to identify reproducible and functionally relevant marker genes and associated pathways in scRNAseq datasets. Using nine scRNAseq datasets from human and mouse immune cells generated by different sequencing technologies, we show that scSCOPE outperforms conventional methods by automatically identifying cell type-specific marker genes and pathways with the highest consistency across all datasets. scSCOPE's gene co-expression and pathway analyses also provide in-depth molecular insights into the functionality of the identified marker genes. We anticipate that scSCOPE will greatly improve cell type annotation and accelerate the design of experimental validation and functional investigations of cell heterogeneity.