Multilevel Psychometric Analysis of Clinician Burnout Using Bayesian G-Theory and IRT Fusion Models.
Simon Ntumi, Tapela Bulala
Abstract
Burnout among intensive care unit (ICU) clinicians is a persistent threat to healthcare quality and clinician well-being, yet existing assessment methods often lack the contextual precision needed to guide effective interventions. This study introduces a novel integration of Bayesian hierarchical modeling, generalizability theory (G-Theory), and item response theory (IRT) to advance the psychometric assessment of burnout among intensive care clinicians, a population particularly vulnerable to chronic occupational stress. The study examined the reliability and contextual sensitivity of burnout measurement tools among 462 ICU clinicians across 9 hospitals in Ghana, South Africa, and Botswana. By fusing G-Theory and IRT within a Bayesian framework, the research addressed a critical gap in understanding how burnout manifests across individuals, items, and institutional settings. The findings revealed that the greatest variance in burnout occurred at the individual level (emotional exhaustion [EE]: 18.72, P < .001; depersonalization [DP]: 12.65, P < .001), reflecting significant personal differences in burnout experiences. Item-level variance was also statistically significant (EE: 4.23, P = .001; DP: 3.12, P = .004), indicating effective item discrimination. Although hospital-level variance was smaller (EE: 2.05, P = .025; DP: 1.76, P = .045), it still pointed to contextual influences. Significant interaction effects (eg, person × item and person × hospital) further emphasized the complex interplay between individual traits and organizational environments. Moderate residual variance across EE (5.44, P = .001), DP (4.02, P = .001), and personal accomplishment (6.15, P < .001) suggested some unexplained variability that may warrant further qualitative exploration.
IRT analysis supported the psychometric strength of the items, showing strong discrimination (EE: 1.17-1.54; DP: 1.09-1.40) and appropriate calibration for detecting moderate levels of burnout (EE difficulty: -1.20 to -0.68; DP: -1.33 to -0.71). Overall, the study validates the methodological advantage of combining G-Theory, IRT, and Bayesian modeling to yield more precise, reliable, and context-sensitive burnout assessments in critical care settings. It is recommended that healthcare institutions adopt a multilevel, psychometrically robust approach to monitoring clinician burnout, leveraging G-Theory/IRT diagnostics to design unit-specific interventions and support policies that target both individual resilience and systemic reform.
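To make the reported discrimination and difficulty ranges more concrete, the following minimal sketch evaluates a two-parameter logistic (2PL) item response function at the endpoints of the EE ranges. The abstract does not state which IRT model was fitted, and burnout items are usually scored on polytomous scales, so the dichotomous 2PL form used here is an assumption for illustration only. The function names `two_pl_probability` and `item_information` are hypothetical, and pairing the lowest discrimination with the lowest difficulty is arbitrary rather than taken from the study.

```python
import math


def two_pl_probability(theta, a, b):
    """Probability of endorsing an item under an assumed 2PL model.

    theta: latent burnout level, a: discrimination, b: difficulty.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = two_pl_probability(theta, a, b)
    return a * a * p * (1.0 - p)


# Endpoints of the EE ranges reported in the abstract.
# The (a, b) pairing is illustrative, not an estimate from the study.
ee_items = [(1.17, -1.20), (1.54, -0.68)]

for a, b in ee_items:
    for theta in (-2.0, b, 0.0, 1.0):
        p = two_pl_probability(theta, a, b)
        info = item_information(theta, a, b)
        print(f"a={a:.2f} b={b:.2f} theta={theta:+.2f} P={p:.3f} info={info:.3f}")
```

Under this assumed model, each item is most informative where the latent level equals its difficulty, so negative difficulties place the peak precision below the latent mean, and a larger discrimination yields a sharper peak.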