Evaluation of trajectory analysis for disease risk assessment: a scoping review.
Freya Pollington, Spiros C Denaxas, Kezhi Li, Johan H Thygesen, Georgios Lyratzopoulos, Becky White
Abstract
OBJECTIVES: Increasingly, structured longitudinal electronic health records (EHRs) are being harnessed to predict the risk of present but as yet undetected disease by analyzing "patient trajectories." Trajectory studies explore clinical event associations, characterize disease trajectories, and enhance risk prediction. This scoping review assesses study characteristics and objectives, identifies model types, and appraises model performance and reporting. MATERIALS AND METHODS: We conducted a scoping review, based on a PubMed and Web of Science search for studies using temporal EHR sequences to identify disease signatures or predict disease presence. RESULTS: We identified 62 studies. Statistical methods, such as testing temporal associations, were primarily used for clustering, while deep learning models focused on outcome prediction. Sixty-five percent of studies used secondary care data, with the most common outcomes being disease agnostic (39%) and cardiovascular disease (20%). Forty-eight studies aimed at risk prediction, with 50% comparing trajectory-based models to static baselines. Among 31 studies reporting area under the curve (AUC), temporal models showed moderate performance gains (relative AUC gain: median 5.7%, range -2.6% to 58.9%; absolute AUC gain: median 4.2%, range -2.3% to 33.0%). DISCUSSION: Trajectory studies are increasing in volume, but application to primary care datasets and to a diverse set of diseases, external validation, and consideration of clinical applicability are lacking. CONCLUSION: While the field's nascency hinders firm conclusions, there are promising results across a range of model types and objectives. Continued research from diverse perspectives will help determine whether this growing field can deliver meaningful clinical benefits.
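The distinction between relative and absolute AUC gain reported in the results can be illustrated with a minimal sketch. It assumes that absolute gain is the difference between the temporal and static-baseline AUC expressed in percentage points, and that relative gain is that difference divided by the baseline AUC; the abstract does not state the exact formulas, so both definitions are assumptions. The function name `auc_gains` and the AUC pairs are hypothetical and are not data from the reviewed studies.

```python
from statistics import median


def auc_gains(baseline_auc: float, temporal_auc: float) -> tuple[float, float]:
    """Return (absolute_gain, relative_gain), both in percent.

    Assumed definitions (not confirmed by the source):
      absolute = (temporal - baseline) * 100
      relative = (temporal - baseline) / baseline * 100
    """
    if baseline_auc <= 0:
        raise ValueError("baseline_auc must be positive")
    diff = temporal_auc - baseline_auc
    return diff * 100, diff / baseline_auc * 100


# Hypothetical (baseline AUC, temporal AUC) pairs, for illustration only.
example_pairs = [(0.80, 0.84), (0.70, 0.69), (0.75, 0.81)]

gains = [auc_gains(base, temp) for base, temp in example_pairs]
median_absolute = median(g[0] for g in gains)
median_relative = median(g[1] for g in gains)

print(f"median absolute gain: {median_absolute:.1f}%")
print(f"median relative gain: {median_relative:.1f}%")
```

Under these assumed definitions, the relative gain exceeds the absolute gain in magnitude whenever the baseline AUC is below 1, which is consistent with the relative figures in the results being larger than the absolute ones.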