Gene expression signatures from whole blood predict amyotrophic lateral sclerosis case status and survival.
Yue Zhao, Masha G Savelieff, Xiayan Li, Kai Guo, Kai Wang, Minghua Li, Bo Li, Gayatri Iyer, Stacey A Sakowski, Lili Zhao, Samuel J Teener, Kelly M Bakulski, John F Dou, Bryan J Traynor, Alla Karnovsky
Abstract
Amyotrophic lateral sclerosis (ALS) is a rare and fatal neurodegenerative disease with a median survival of only 2 to 4 years from diagnosis. Improved tools are needed to shorten diagnostic delays and improve prognostication to benefit clinical care. Herein, we profiled whole blood gene expression by RNA sequencing in a large cohort of ALS participants (n = 422) versus controls (n = 272). Several machine learning classifiers trained on our detailed gene expression dataset accurately predicted case-control status, including in a fully independent external test cohort, achieving an area under the receiver operating characteristic curve of 0.894 with the best performing model. Integrating gene expression features with clinical variables improved our ability to discriminate ALS cases into shorter, intermediate, and longer survival in an external dataset. Finally, we identified ALS-relevant pathways in our blood transcriptomics dataset as well as "core genes" that overlapped with gene expression changes occurring in the primary disease tissue, facilitating a drug perturbation analysis that identified several candidates. Overall, our results highlight the potential diagnostic and prognostic applications of whole blood gene expression data, with important implications for improving ALS clinical care.