EXPLANA: a user-friendly workflow for EXPLoratory ANAlysis and feature selection in cross-sectional and longitudinal microbiome studies.
Jennifer Fouquier, Maggie Stanislawski, John O'Connor, Ashley Scadden, Catherine Lozupone
Abstract
MOTIVATION: Longitudinal microbiome studies (LMS) are increasingly common but pose analytic challenges, including nonindependent data that require mixed-effects models. Furthermore, large amounts of data motivate exploratory analysis to identify factors related to outcome variables. Although change analysis (i.e. calculating feature changes between timepoints) can be powerful, how best to conduct these analyses is often unclear. For example, measurements in observational LMS fluctuate naturally, so baseline might not be the reference of primary interest, whereas in interventional LMS, baseline is typically a key reference point, often indicating the start of treatment.

RESULTS: To address these challenges, a feature selection workflow called EXPLANA (EXPLoratory ANAlysis) was developed for LMS. It supports numerical and categorical data and also accommodates cross-sectional studies. Machine learning methods were combined with different types of change calculations and downstream interpretation methods to identify statistically meaningful variables and explain their relationship to outcomes. EXPLANA generates an interactive report that textually and graphically summarizes methods and results. EXPLANA performed well on simulated longitudinal data, with a balanced accuracy score of 0.91 (range: 0.79-1.00, SD = 0.05), outperformed an existing tool, QIIME 2 feature-volatility (balanced accuracy: 0.95 versus 0.56), and identified novel order-dependent categorical feature changes (e.g. different effect for A_B versus B_A). EXPLANA is broadly applicable and simplifies analytics for identifying features related to outcomes of interest.

AVAILABILITY AND IMPLEMENTATION: Software is available at https://github.com/JTFouquier/explana and https://zenodo.org/records/17478745 (10.5281/zenodo.17478744). Documentation and demos are available at www.explana.io.
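
The change calculations mentioned in the abstract can be illustrated with a minimal sketch. It is not EXPLANA's implementation: the data, the `compute_changes` function, and its `reference` parameter are hypothetical and exist only for illustration. The sketch shows how the choice of reference point (previous timepoint versus baseline) alters the computed changes, and how labeling a categorical change as `from_to` keeps A_B distinct from B_A.

```python
from collections import defaultdict

# Hypothetical long-format records (not from the paper):
# (subject, timepoint, numeric feature value, categorical feature value)
records = [
    ("s1", 0, 2.0, "A"),
    ("s1", 1, 3.5, "B"),
    ("s1", 2, 3.0, "A"),
    ("s2", 0, 1.0, "B"),
    ("s2", 1, 1.0, "B"),
]


def compute_changes(records, reference="previous"):
    """Compute per-subject feature changes between timepoints.

    reference="previous": compare each timepoint with the one before it.
    reference="baseline": compare each timepoint with the subject's first one.
    """
    if reference not in ("previous", "baseline"):
        raise ValueError("reference must be 'previous' or 'baseline'")

    # Group rows by subject so changes are never computed across subjects.
    by_subject = defaultdict(list)
    for subject, time, value, category in records:
        by_subject[subject].append((time, value, category))

    changes = []
    for subject, rows in by_subject.items():
        rows.sort(key=lambda row: row[0])  # order by timepoint
        for i in range(1, len(rows)):
            ref = rows[0] if reference == "baseline" else rows[i - 1]
            cur = rows[i]
            changes.append({
                "subject": subject,
                "from_time": ref[0],
                "to_time": cur[0],
                # Numeric change: later value minus reference value.
                "delta": cur[1] - ref[1],
                # Order-dependent label: "A_B" is distinct from "B_A".
                "category_change": f"{ref[2]}_{cur[2]}",
            })
    return changes


previous_changes = compute_changes(records, reference="previous")
baseline_changes = compute_changes(records, reference="baseline")
```

With the previous timepoint as reference, subject `s1` yields both an A_B and a B_A change; with baseline as reference, the second comparison for the same subject becomes A_A, which shows why the reference choice matters for interpretation.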