Using large language models for temporal relation extraction from pediatric clinical reports.
Judith Jeyafreeda Andrew, Juliette Potier, Nicolas Garcelon, Anita Burgun, Marc Vincent
Abstract
Objectives: To evaluate large language models (LLMs) for extracting temporal relations from pediatric rare disease clinical reports to enable automated patient timeline creation.
Materials and Methods: We developed a temporal relation extraction framework for electronic health records, using 25 clinical reports from a pediatric rare disease hospital. We implemented few-shot prompting with 3 different LLMs in secure environments.
Results: Our findings reveal that binary classification significantly outperforms multi-class approaches for temporal relation extraction, with best F1 scores reaching 0.70 for simpler relations while more complex relations remain challenging (F1: 0.03-0.40). Mistral 22B emerged as the strongest overall performer, though model superiority varied by relation type.
Discussion: The dramatic performance improvement from reducing cognitive load (binary vs multi-class classification) demonstrates that task formulation critically impacts LLM effectiveness in specialized clinical domains. Our few-shot approach successfully enables temporal relation extraction from French pediatric texts while maintaining data privacy through local deployment, offering a viable methodology for healthcare institutions with strict data governance requirements.
Conclusion: Our few-shot prompting approach demonstrated promising results in secure environments. This methodology allows technique sharing without exposing sensitive data, advancing research possibilities for clinical natural language processing in restricted settings.
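The contrast between multi-class and binary task formulation can be illustrated with a minimal sketch of few-shot prompt construction. This is not the authors' implementation: the label set (`BEFORE`, `AFTER`, `OVERLAP`), the prompt wording, the helper names, and the example sentences are all assumptions made for illustration, since the abstract does not specify them. The example sentences are invented and are not drawn from the study corpus.

```python
from dataclasses import dataclass

# Hypothetical label set; the abstract does not list the relation types used.
RELATIONS = ["BEFORE", "AFTER", "OVERLAP"]


@dataclass
class Example:
    text: str      # passage containing both mentions
    source: str    # first event or time expression
    target: str    # second event or time expression
    relation: str  # gold label, one of RELATIONS


def _mention_block(ex):
    """Lines describing one passage and its mention pair."""
    return [f"Text: {ex.text}", f"Mentions: {ex.source} / {ex.target}"]


def multiclass_prompt(shots, query):
    """Single prompt: the model must choose one label among all relations."""
    lines = [f"Choose the temporal relation: {', '.join(RELATIONS)}.", ""]
    for ex in shots:
        lines += _mention_block(ex) + [f"Answer: {ex.relation}", ""]
    lines += _mention_block(query) + ["Answer:"]
    return "\n".join(lines)


def binary_prompts(shots, query):
    """One yes/no prompt per relation: each asks only whether that relation holds."""
    prompts = {}
    for rel in RELATIONS:
        lines = [f"Does the relation {rel} hold? Answer yes or no.", ""]
        for ex in shots:
            answer = "yes" if ex.relation == rel else "no"
            lines += _mention_block(ex) + [f"Answer: {answer}", ""]
        lines += _mention_block(query) + ["Answer:"]
        prompts[rel] = "\n".join(lines)
    return prompts


# Invented demonstration examples (assumption, not study data).
SHOTS = [
    Example("Fever started two days before admission.", "fever", "admission", "BEFORE"),
    Example("Seizures occurred during the hospital stay.", "seizures", "hospital stay", "OVERLAP"),
]
# The gold label of the query is unknown at inference time, so it is left empty.
QUERY = Example("The biopsy was performed after the MRI.", "biopsy", "MRI", "")
```

In this sketch the binary formulation sends one short question per relation type to the model instead of a single question with several possible answers, so the same few-shot examples are reused with their labels recast as yes or no. Each resulting prompt would then be passed to a locally deployed model; that call is deliberately left out because the abstract does not describe the serving setup.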