Microbial Ecological Signatures Predict Pathogen Emergence and Multidrug Resistance in Cystic Fibrosis Airways up to a Year in Advance.
Thomas R Goddard, Jessica Ap Carlson-Jones, Morton Judith, Chee Y Ooi, Tai Andrew, Morgyn S Warner, Wong John, Ieuan Es Evans, Emily Hopkins, Jonathan R Iredell, Hubertus Pa Jersmann, Katrine L Whiteson, George Bouras, Michael P Doane, Nicholas W Falk
Abstract
Chronic infections in cystic fibrosis (CF) emerge from gradual ecological transitions in the airway microbiome, yet early predictive markers remain poorly defined. We developed a new autoencoder-based framework that outperforms read-based or metagenome-assembled genome-based analyses at capturing the continuum from health-associated commensals to pathogen-dominated, antibiotic-tolerant communities. This improvement is achieved by integrating taxonomic and functional data from 127 sputum and bronchoalveolar lavage metagenomes from 64 people with CF into latent "Clusters of Phylogeny and Functions" (COPFs). Coupled with gradient-boosted random forests, COPFs predicted Pseudomonas aeruginosa colonisation, multidrug resistance, and impending infection up to a year before clinical detection. The multidrug-resistant P. aeruginosa signature showed the same resistance-mechanism evolution as found in laboratory experiments. The inclusion of eukaryotic markers revealed persistent Aspergillus fumigatus signatures even during culture-negative intervals. Applying our South Australian-trained model to over 1,000 global metagenomes from 22 independent CF datasets, we achieved 94% accuracy in predicting P. aeruginosa status across platforms and geographies, validating the model's universal utility. Our results demonstrate that combining datasets with deep learning reveals conserved ecological and metabolic mechanisms in disease progression, transforming metagenomics into a predictive framework for managing chronic infections.