scFPC-DE: Robust Differential Expression Analysis Along Single Cell Trajectories via Functional Principal Component Analysis.
Ricardo J López Candelaria, Yu Qian, Fang Chen, Mansun Law, Yun Zhang, Xing Qiu
Abstract
Motivation: Identifying temporally differentially expressed genes (TDEGs) along pseudotime trajectories from single cell RNA sequencing (scRNA-seq) data helps characterize the cellular states that underlie the dynamic process of cellular development. However, existing tests based on generalized additive models (GAMs) suffer from increased false positive rates under zero inflation caused by high dropouts, a ubiquitous technical artifact of scRNA-seq data. Furthermore, by testing each gene independently, existing tests ignore the variance-covariance structure shared across genes along the trajectory, leading to suboptimal power and reduced interpretability.
Results: We present scFPC-DE, a trajectory-based differential expression analysis (TDEA) method based on functional data analysis (FDA). It models gene expression as a function of pseudotime in the L² space and represents the covariance structure of these functions by eigenfunctions derived from functional principal component (FPC) analysis. This approach effectively captures informative gene expression patterns along the trajectory while mitigating the influence of zero inflation, in both simulations and real data analysis. In simulations, scFPC-DE exhibited superior control of type I error and achieved the highest ROC-AUC among competing methods. When applied to an scRNA-seq dataset of B cell subtypes, scFPC-DE uniquely identified TDEGs enriched for B cell differentiation pathways, outperforming existing methods in biological relevance. These results show that scFPC-DE effectively captures the shared gene expression variation and pseudo-temporal structure along the single cell trajectory for TDEG identification.
Availability: R package and code vignettes are publicly available at https://github.com/LopezRicardo1/scFPCDE.
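The FPC idea described in the Results can be illustrated with a minimal sketch. This is not the scFPC-DE implementation (the released package is in R) and it does not perform any differential expression test; it only shows, under stated assumptions, how a covariance structure shared across genes can be summarized by a few eigenfunctions over pseudotime. The function name `fpca_on_grid`, the simulated curves, the two cosine basis functions, and the dense, evenly spaced pseudotime grid are all assumptions made for illustration, and zero inflation is ignored.

```python
import numpy as np

def fpca_on_grid(curves, grid, n_components=2):
    """Discretized FPCA (hypothetical helper, not part of scFPC-DE).

    curves: (n_genes, n_grid) expression curves on an evenly spaced grid.
    Returns the mean curve, eigenfunctions, per-gene FPC scores, and the
    fraction of total variance explained by each retained component.
    """
    dt = grid[1] - grid[0]
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Sample covariance surface C(s, t), estimated across genes.
    cov = centered.T @ centered / (curves.shape[0] - 1)
    # Riemann-sum approximation of the covariance operator.
    evals, evecs = np.linalg.eigh(cov * dt)
    order = np.argsort(evals)[::-1][:n_components]
    top_evals = evals[order]
    # Rescale eigenvectors so that each eigenfunction has unit L2 norm,
    # i.e. sum(phi(t)^2) * dt == 1.
    eigenfunctions = evecs[:, order].T / np.sqrt(dt)
    # FPC scores: L2 inner product of each centered curve with each eigenfunction.
    scores = centered @ eigenfunctions.T * dt
    explained = top_evals / np.trace(cov * dt)
    return mean_curve, eigenfunctions, scores, explained

# Simulated example (assumed data): 200 genes sharing two smooth modes of variation.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
basis = np.vstack([
    np.sqrt(2) * np.cos(np.pi * grid),
    np.sqrt(2) * np.cos(2 * np.pi * grid),
])
true_scores = rng.normal(size=(200, 2)) * np.array([2.0, 1.0])
curves = true_scores @ basis + rng.normal(scale=0.1, size=(200, grid.size))

mean_curve, eigenfunctions, scores, explained = fpca_on_grid(curves, grid)
print("fraction of variance explained:", explained)
```

In this toy setting the first two eigenfunctions recover most of the variation because the curves were generated from two shared modes; real scRNA-seq data would first require smoothing or binning of cells along pseudotime, which this sketch does not address.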