OmixLitMiner 2: Guided Literature Mining for Automated Categorization of Marker Candidates in Omics Studies.
Antonia Gocke, Bente Siebels, Jelena Navolić, Carla Reinbold, Julia E Neumann, Stefan Kurtz, Hartmut Schlüter
Abstract
Omics analyses are crucial for understanding molecular mechanisms in biological research. The vast number of detected biomolecules makes it difficult to identify potential biomarkers. Traditional approaches rely on labor-intensive manual literature mining to extract meaningful insights from long lists of regulated candidate biomolecules. To address this, we developed OmixLitMiner 2 (OLM2), which makes omics data interpretation more efficient, speeds up the validation of results, and accelerates the selection of marker candidates for subsequent experiments. The updated tool uses UniProt to retrieve synonyms and protein names, and searches the titles or abstracts of the available biomedical literature through the PubMed database and PubTator 3.0. It supports advanced keyword-based searches and classifies proteins or genes according to how well they are represented in the literature in relation to a given scientific question. OLM2 comes with a user-friendly Google Colab interface and, compared with the previous version, improves both the retrieval of relevant publications and the classification of biomolecules. Using a case study of spatially resolved proteomic data from the mouse brain cortex, we demonstrate that the tool significantly reduces the time required compared with manual searches and enhances the interpretability of molecular analyses.
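The search-and-classify idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the OLM2 implementation: the function names `build_pubmed_query` and `categorize`, the category labels, and the decision rule are hypothetical and chosen only for illustration. The only external convention relied on is PubMed's field-tag syntax (for example `[Title/Abstract]`). The sketch makes no network calls; the synonyms are assumed to have been retrieved beforehand (for instance from UniProt), and the hit counts are assumed to come from a separate literature search.

```python
def build_pubmed_query(names, keywords, field="Title/Abstract"):
    """Combine protein/gene names and keywords into one PubMed query string.

    Hypothetical helper: names are OR-ed, keywords are OR-ed, and the two
    groups are joined with AND, each term restricted to the given field.
    """
    if not names or not keywords:
        raise ValueError("at least one name and one keyword are required")

    def any_of(terms):
        # Quote each term and restrict it to the chosen PubMed search field.
        return "(" + " OR ".join(f'"{t}"[{field}]' for t in terms) + ")"

    return f"{any_of(names)} AND {any_of(keywords)}"


def categorize(n_reviews, n_articles):
    """Assign an illustrative literature category from hit counts.

    The labels and the rule are assumptions for this sketch, not the
    classification scheme used by OLM2.
    """
    if n_reviews > 0:
        return "covered in reviews"
    if n_articles > 0:
        return "reported in research articles"
    return "no hits"


# Example input: one gene symbol with its protein name, and one keyword.
query = build_pubmed_query(
    ["GFAP", "Glial fibrillary acidic protein"],
    ["cortex"],
)
print(query)
```

Joining synonyms with OR and keywords with AND keeps the query broad with respect to naming variants while still tying each hit to the scientific question. A title-only search would pass `field="Title"` instead.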