Abstract

Background: Microbiome sequencing has brought increasing attention to the polymicrobial context of chronic infections. However, clinical microbiology continues to focus on canonical human pathogens, which may overlook informative, but non-pathogenic, biomarkers. We address this disconnect in lung infections in people with cystic fibrosis (CF).

Methods: We collected health information (lung function, age, BMI) and sputum samples from a cohort of 77 children and adults with CF. Samples were collected during a period of clinical stability and 16S rDNA sequenced for airway microbiome compositions. We use Elastic Net regularization to train linear models predicting lung function and extract the most informative features.

Results: Models trained on whole microbiome quantitation outperform models trained on pathogen quantitation alone, with or without the inclusion of patient metadata. Our most accurate models retain key pathogens as negative predictors (Pseudomonas, Achromobacter) along with established correlates of CF disease state (age, BMI, CF-related diabetes). In addition, our models select non-pathogen taxa (Fusobacterium, Rothia) as positive predictors of lung health.

Conclusions: These results support a reconsideration of clinical microbiology pipelines to ensure the provision of informative data to guide clinical practice.
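
The Methods describe Elastic Net regularization as the way informative features are extracted from a linear model. The following is a minimal sketch of that idea under stated assumptions, not the authors' pipeline. The data are entirely synthetic, the feature names (`pathogen_1`, `commensal_1`, `noise_0`, and so on) and the penalty settings `alpha` and `l1_ratio` are hypothetical, and the solver is a plain coordinate-descent implementation written only with the Python standard library. Only the sample count of 77 is borrowed from the cohort size; in practice a library such as scikit-learn would probably be used instead.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; this is what produces exact zeros."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def centre_columns(X):
    """Subtract each column mean so no intercept term is needed."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent minimising
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
                        + 0.5*alpha*(1 - l1_ratio)*||w||^2.
    Assumes X columns and y are already centred."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * w[j]
            new_w = soft_threshold(rho, alpha * l1_ratio)
            new_w /= col_sq[j] + alpha * (1 - l1_ratio)
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


# Synthetic data only: hypothetical features and made-up effect sizes.
random.seed(0)
names = ["pathogen_1", "pathogen_2", "commensal_1", "commensal_2"]
names += [f"noise_{k}" for k in range(6)]
true_w = [-1.0, -0.6, 0.5, 0.4] + [0.0] * 6
n_samples = 77  # matches the cohort size, nothing else is from the study

X_raw = [[random.gauss(0, 1) for _ in names] for _ in range(n_samples)]
y_raw = [
    sum(x * w for x, w in zip(row, true_w)) + random.gauss(0, 0.5)
    for row in X_raw
]

X = centre_columns(X_raw)
y_mean = sum(y_raw) / n_samples
y = [v - y_mean for v in y_raw]

coef = elastic_net(X, y, alpha=0.1, l1_ratio=0.5)

# Rank features by absolute coefficient, largest first.
ranked = sorted(zip(names, coef), key=lambda item: -abs(item[1]))
for name, value in ranked:
    print(f"{name:12s} {value:+.3f}")
```

The sign of each retained coefficient indicates whether the feature acts as a negative or positive predictor of the outcome, and the L1 part of the penalty pushes uninformative coefficients toward zero. Raising `alpha` or `l1_ratio` makes the selection stricter.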