Psychiatric services (Washington, D.C.)

Humans, Risk Assessment, Suicide, Adult, Male

Evaluation of Alignment Between Large Language Models and Expert Clinicians in Suicide Risk Assessment.

Ryan K McBain, Jonathan H Cantor, Li Ang Zhang, Olesya Baker, Fang Zhang, Alyssa Burnett, Aaron Kofner, Joshua Breslau, Bradley D Stein, Ateev Mehrotra, Hao Yu

Published: 2025
DOI: 10.1176/appi.ps.20250086

Abstract

OBJECTIVE: This study aimed to evaluate whether three popular chatbots powered by large language models (LLMs), namely ChatGPT, Claude, and Gemini, provided direct responses to suicide-related queries and how these responses aligned with clinician-determined…
