A shrinkage-based statistical method for testing group mean differences in quantitative bottom-up proteomics.
Namgil Lee, Hojin Yoo, Juhyoung Kim, Heejung Yang
Abstract
BACKGROUND: In bottom-up proteomics using data-independent acquisition mass spectrometry (DIA-MS), quantitative measurements are obtained after multiple steps of protein fragmentation and ionization, which introduce cumulative errors and impair the effectiveness of classical statistical methods. This study proposes an alternative statistical approach for testing group mean differences at the peptide level in quantitative bottom-up proteomics. RESULTS: We present a novel probabilistic graphical model that accounts for the non-normality of empirical distributions and the correlations between fragment ion quantities. Based on this model, we propose a new statistical method that improves upon the classical feature-based approach by incorporating distribution-free shrinkage estimation of covariance matrices and bootstrap-based estimation of degrees of freedom. Simulation experiments demonstrate that the proposed method outperforms the four most widely used classical methods in terms of specificity, sensitivity, and accuracy, particularly when the data distribution closely resembles real MS data and when sample sizes are small. Numerical analysis of real quantitative tandem mass spectrometry data shows that the proposed method effectively identifies candidate peptides whose mean quantity changes following treatment with the kinase inhibitor staurosporine. CONCLUSIONS: The proposed statistical method offers an effective alternative to classical approaches for differential analysis of peptides in quantitative bottom-up proteomics using DIA-MS. The R software package MDstatsDIAMS is available at https://github.com/namgillee/MDstatsDIAMS .
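
To illustrate the general idea behind shrinkage estimation of a covariance matrix, the following is a minimal Python sketch and not the estimator implemented in MDstatsDIAMS. It assumes a simple linear shrinkage of the sample covariance toward its own diagonal with a user-supplied intensity; the function name `shrink_covariance`, the fixed intensity of 0.5, and the simulated data (4 replicates, 6 fragment ions) are all hypothetical choices made for illustration. The abstract does not specify how the shrinkage intensity is chosen, so no attempt is made to reproduce that step here.

```python
import numpy as np


def shrink_covariance(x, intensity):
    """Linearly shrink the sample covariance of x toward its diagonal.

    x         : array of shape (n_samples, n_features)
    intensity : shrinkage weight in [0, 1]; 0 returns the sample
                covariance, 1 returns the diagonal target.
    NOTE: illustrative only; the intensity is fixed by the caller,
    not estimated from the data.
    """
    x = np.asarray(x, dtype=float)
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    sample_cov = np.cov(x, rowvar=False)      # unbiased sample covariance
    target = np.diag(np.diag(sample_cov))     # keep variances, drop covariances
    return (1.0 - intensity) * sample_cov + intensity * target


# Hypothetical small-sample setting: 4 replicates, 6 fragment ions.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6))

s = np.cov(x, rowvar=False)
s_shrunk = shrink_covariance(x, 0.5)

# With fewer samples than features the sample covariance is singular,
# whereas the shrunk estimate is positive definite and hence invertible.
print("rank of sample covariance:", np.linalg.matrix_rank(s))
print("smallest eigenvalue after shrinkage:", np.linalg.eigvalsh(s_shrunk).min())
```

The sketch shows why shrinkage matters when sample sizes are small: with fewer replicates than fragment ions the sample covariance cannot be inverted, while the shrunk matrix keeps the original variances on its diagonal and remains usable in a test statistic.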