aiGeneR 3.0: an enhanced deep network model for resistant strain identification and multi-drug resistance prediction in Escherichia coli causing urinary tract infection using next-generation sequencing data.
Debasish Swapnesh Kumar Nayak, Abhilash Pati, Amrutanshu Panigrahi, Mudassir Khan, Bayan Alabdullah, Santanu Kumar Sahoo, Bibhuprasad Sahu, Abrar Almjally, Saurav Mallik, Tripti Swarnkar
Abstract
Background: Infectious diseases pose a global health threat, and antimicrobial resistance (AMR) exacerbates it. Because Escherichia coli (E. coli) is frequently linked to urinary tract infections, studying its antibiotic resistance genes is essential for identifying and combating the growing problem of drug resistance. Objective: Machine learning (ML), particularly deep learning (DL), has proven effective in rapidly detecting resistant strains, which supports infection prevention and helps reduce mortality. We propose aiGeneR 3.0, a simplified and effective DL model that employs a long short-term memory (LSTM) mechanism to identify resistant and multi-drug-resistant strains of E. coli. The aiGeneR 3.0 paradigm for identifying and classifying antibiotic resistance couples quality control in tandem with DL models. Cross-validation was used to measure ROC-AUC, F1-score, accuracy, precision, sensitivity, and specificity, and thereby the overall classification performance of aiGeneR 3.0. We hypothesized that aiGeneR 3.0 would be more effective than baseline DL models for antibiotic resistance detection at an acceptable computational cost. We also assessed how well the model memorizes and generalizes. Results: aiGeneR 3.0 can handle imbalanced and small datasets, offering higher classification accuracy (93%) with a simple model architecture. For multi-drug resistance prediction, aiGeneR 3.0 achieves an accuracy of 98%. Because aiGeneR 3.0 applies deep networks (LSTM) to next-generation sequencing (NGS) data, it is suitable for identifying resistance to novel antibiotics and emerging resistance in the future. Conclusion: This work uniquely integrates SNP-level insights with DL, offering potential clinical utility in guiding antibiotic stewardship. It also provides a robust model that both memorizes and generalizes well for future use in AMR analysis.
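The evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code. The function names (`confusion_counts`, `classification_metrics`, `roc_auc`) and the label encoding (1 = resistant, 0 = susceptible) are assumptions made for illustration. In a cross-validation setting, these quantities would typically be computed on each held-out fold and then averaged.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels; 1 = resistant (assumed encoding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num: float, den: float) -> float:
    """Safe division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, sensitivity (recall), specificity, and F1-score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # true positive rate
    specificity = _ratio(tn, tn + fp)  # true negative rate
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the sample
    count, which is fine for small datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

On imbalanced data, accuracy alone can be misleading, which is one reason to report sensitivity, specificity, and ROC-AUC alongside it.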