Journal of medical Internet research
Humans; Cross-Over Studies; Artificial Intelligence; Cohort Studies; Male

AI-Enhanced Social Robotic Versus Computer-Based Virtual Patients for Clinical Reasoning Training in Medical Education: Observational Crossover Cohort Study.

Alexander Borg, Jonathan Schiött, William Ivegren, Cidem Gentline, Viking Huss, Anna Hugelius, Benjamin Jobs, Fabricio Espinosa, Mini Ruiz, Samuel Edelbring, Carina Georg, Gabriel Skantze, Ioannis Parodis

Published: 2025. DOI: 10.2196/82541

Abstract

Open Access

BACKGROUND: Virtual patient (VP) simulations can be used to practice clinical reasoning (CR) in controlled learning environments. Traditional computer-based VP platforms often lack the authenticity and interactivity required for effective CR training. Artificial intelligence (AI)-enhanced social robotic VPs can enhance realism and engagement; however, quantitative evidence comparing them with conventional VP platforms remains limited.

OBJECTIVE: We compared medical students' experience of an AI-enhanced social robotic versus a conventional computer-based VP platform regarding the extent to which the design characteristics of the respective platform facilitate CR skill training.

METHODS: This observational crossover cohort study involved 178 sixth-semester medical students at Karolinska Institutet, Stockholm, Sweden (response rate: 42.3%; 178 of 421 invited students; Spring 2024-Spring 2025), who experienced both a large language model-enhanced social robotic VP platform supporting dialogue (social artificial intelligence-enhanced robotic interface [SARI]) and a conventional computer-based VP platform (virtual interactive case [VIC]) during their clinical rotation within rheumatology. Platform order was determined by clinical rotation scheduling. VP design was evaluated using a validated questionnaire across 5 domains: authenticity, professional approach, coaching quality, learning effects, and overall judgment. Students' CR training preferences were assessed using categorical responses and a Visual Analogue Scale, where a lower score favored SARI and a score of 5 indicated equal preference between platforms.

RESULTS: SARI outperformed VIC across all 5 VP design domains.
Students rated SARI higher for authenticity (median 4.0, IQR 3.5-4.5 vs 3.0, IQR 2.5-3.5; P<.001), professional approach (median 4.5, IQR 4.0-4.8 vs 4.0, IQR 3.5-4.5; P<.001), coaching quality (median 4.3, IQR 4.0-4.7 vs 4.0, IQR 3.7-4.7; P<.001), learning effects (median 4.4, IQR 4.0-5.0 vs 4.0, IQR 3.5-4.5; P<.001), and overall judgment (median 5.0, IQR 4.0-5.0 vs 4.0, IQR 4.0-5.0; P<.001). Students strongly preferred SARI for CR training (72% vs 14%; odds ratio [OR] 27.1, 95% CI 14.3-53.7; P<.001), with Visual Analogue Scale scores confirming this preference (median 3.0, IQR 2.0-5.0; P<.001). Preferences were consistent across most subgroups (sex, prior VP experience, and platform order); the difference was not significant in 2 subgroups: students with prior VP experience (62% vs 38%; OR 2.6; 95% CI 0.8-8.9; P=.11) and students first introduced to VIC (55% vs 45%; OR 1.5; 95% CI 0.7-2.9; P=.33).

CONCLUSIONS: Unlike previous qualitative studies that examined each platform separately, this study provides the first quantitative evidence that AI-enhanced social robotic VPs offer design characteristics superior to those of conventional computer-based platforms for CR training in medical education. These results support the use of AI-driven social robots for VP simulations to better prepare medical students for real clinical encounters, and warrant future research on objective CR skill outcomes and long-term transfer to clinical practice.
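
For readers less familiar with the statistics quoted in the abstract, the following is a minimal Python sketch of two of them: the response rate, recomputed from the figures stated in METHODS, and an odds ratio with a 95% CI. The abstract reports neither the underlying counts nor the method used for its ORs and intervals, so the function names, the Wald (log-scale) interval, and the 2x2 counts below are assumptions chosen purely for illustration. They are not the study's data or the authors' analysis.

```python
import math


def response_rate(responders: int, invited: int) -> float:
    """Return the response rate as a percentage."""
    return 100.0 * responders / invited


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table [[a, b], [c, d]].

    Assumption: a Wald interval on the log-odds scale; the abstract does not
    say which interval method the authors used.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)


# Figures stated in the abstract: 178 respondents of 421 invited students.
print(round(response_rate(178, 421), 1))  # 42.3

# Hypothetical counts, NOT taken from the study, only to show the calculation.
or_value, ci_low, ci_high = odds_ratio_wald(30, 10, 10, 30)
print(f"OR {or_value:.1f}, 95% CI {ci_low:.1f}-{ci_high:.1f}")
```

An interval that includes 1, as in the two nonsignificant subgroups reported above, means the data are compatible with no difference in preference.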