Predicting autism spectrum disorder severity in children based on specific language milestones: a random forest model approach.
Haiyi Xiong, Xueli Xiang, Xiao Liu, Ting Yang, Jinjin Chen, Jie Chen, Tingyu Li
Abstract
BACKGROUND: Language impairments are among the most prevalent co-occurring conditions in children with autism spectrum disorder (ASD), and delayed language milestones often serve as early developmental warning signs. However, it remains unclear whether specific language milestones can reliably predict the severity of ASD symptoms, particularly in regions where there is a long delay between initial screening and formal diagnosis.
METHODS: This study included 574 children diagnosed with ASD, stratified into two age groups: under 4 years (n = 288) and 4 years or above (n = 286). A total of 33 language milestone items covering receptive, expressive, and pragmatic aspects were evaluated. The Boruta algorithm was applied to identify significant predictors of symptom severity, and random forest models were constructed separately for each age group. Nested cross-validation and grid search were used for hyperparameter tuning. Model performance was assessed using bootstrapping with 1,000 replications to estimate area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1 score.
RESULTS: In children under 4 years, 14 features were identified as significant predictors of ASD severity, with "Identifies 1 picture" and "Expresses demands by language" ranked highest. In children aged 4 years and above, 16 features were significant, with "Identifies 2 colors" and "Calls partner by name" being the most influential. The random forest models demonstrated robust predictive performance, with AUC values of 0.81 ± 0.01 (younger group) and 0.85 ± 0.00 (older group).
CONCLUSION: Our findings suggest that specific early language milestones, particularly those reflecting pragmatic abilities, may serve as valuable predictors of ASD severity. Leveraging these milestones in clinical practice could support earlier severity stratification and facilitate more tailored intervention planning, particularly in primary care settings.
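The bootstrap evaluation described in METHODS can be illustrated with a minimal sketch. It is not the authors' code: the function names (`auc_score`, `threshold_metrics`, `bootstrap_metrics`), the 0.5 decision threshold, the random seed, and the toy labels and scores are all hypothetical. The abstract does not say what the ± values denote, so the sketch assumes one plausible reading, the mean and standard deviation of each metric across 1,000 resamples drawn with replacement.

```python
import random
from statistics import mean, stdev


def auc_score(y_true, y_score):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity and F1 at an assumed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def bootstrap_metrics(y_true, y_score, n_boot=1000, seed=0):
    """Resample cases with replacement and summarise each metric as
    (mean, standard deviation) across the replicates."""
    rng = random.Random(seed)
    n = len(y_true)
    samples = {}
    done = 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        if len(set(yt)) < 2:
            continue  # redraw: AUC is undefined with a single class
        ys = [y_score[i] for i in idx]
        row = threshold_metrics(yt, ys)
        row["auc"] = auc_score(yt, ys)
        for name, value in row.items():
            samples.setdefault(name, []).append(value)
        done += 1
    return {name: (mean(v), stdev(v)) for name, v in samples.items()}


# Toy data invented for illustration only (1 = more severe, 0 = less severe).
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
toy_scores = [0.10, 0.30, 0.35, 0.60, 0.40, 0.70, 0.80, 0.90, 0.20, 0.65]

summary = bootstrap_metrics(toy_labels, toy_scores, n_boot=1000, seed=0)
for name, (avg, sd) in summary.items():
    print(f"{name}: {avg:.2f} ± {sd:.2f}")
```

In a real analysis the scores would come from held-out predictions of the tuned random forest for each age group, so that the bootstrap summarises out-of-sample performance rather than fit to the training data.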