An Egocentric Life-Saving Interventional Procedure Dataset of Actions, Medical Questions, Maneuvers and Tools.
Yupeng Zhuo, Eddie Zhang, Xiangchen Yu, Aditya Pachpande, Wenqu Fang, Xiyue Chen, Andrew W Kirkpatrick, Kyle Couperus, Christopher Colombo, Oanh Tran, Jonah Beck, DeAnna DeVane, Jessica McKee, Chad Gorbatkin, Ross Candelore
Abstract
This paper introduces the Trauma THOMPSON dataset, designed to advance AI-driven decision support for life-saving interventions (LSIs) in emergency care, particularly in resource-constrained humanitarian settings. The dataset comprises 3,717 high-resolution egocentric video clips of both regular and just-in-time (JIT) procedures. The JIT clips show the same LSI procedures performed with makeshift tools, which makes them useful for studying human medical commonsense. Each clip is annotated by medical professionals in verb-noun format, such as "take scalpel" and "make incision". In addition to action segments, the dataset includes annotations for medical visual question answering (MVQA), hand maneuvers, and object detection. Ultimately, the dataset and its annotations can be used to train an AI agent that advises first responders in the field on what to do next with the resources at hand. We provide benchmarks for action recognition, action anticipation, and MVQA using state-of-the-art machine learning models.