Using Multiple Logistic Regression Analysis and Random Forest Models to Identify Factors in the Development of Low Back Pain during Postflexion in Children
Yushin YOSHIZATO, Kiyohisa NATSUME
Vol. 13 (2024) p. 197-204
The incidence of low back pain (LBP) among children is increasing. Although LBP can be classified into LBP during anteflexion (LBPAF) and LBP during postflexion (LBPPF), the causes of both remain unknown. In previous research, we analyzed the causes of pediatric LBPAF by stepwise regression with a multiple logistic regression (MLR) model, but the model tended to overfit. Therefore, this study explored pediatric LBPPF and examined its causes using an MLR model with elastic net (ENET) regularization, hereafter ENET-MLR, and a conditional inference forest (CIF) model. We enrolled 319 children aged 4-15 years; approximately 12% had LBPPF, all of whom were older than 7 years. Pediatric LBPPF exhibited an earlier age of onset and a higher prevalence compared with previous data on LBPAF. The MLR, ENET-MLR, and CIF models were developed using 15 variables obtained from questionnaires and physical examinations. The areas under the receiver operating characteristic curve (AUC) for the three models ranged from 0.68 to 0.72, with accuracy of 68-72%, sensitivity of 50-60%, and specificity of 71-77%. The ENET-MLR and CIF models showed higher accuracy and specificity than the MLR model. Furthermore, with ENET-MLR, test accuracy did not decrease relative to training accuracy, whereas the drop from training to test accuracy was larger with CIF. Notably, the ENET-MLR and MLR models identified the same three explanatory variables, which had low intercorrelation, for discriminating between LBPPF and non-LBPPF, and their influence on LBPPF could be explained by the regression coefficients. These three variables were a history of LBP, increased anterior thigh muscle flexibility, and increased spinal mobility and posterior thigh muscle flexibility. In the CIF model, the same variables identified by the two MLR models were the most important when permutation variable importance measures were calculated based on the AUC.
In conclusion, ENET-MLR is considered the best of the three models for discriminating pediatric LBPPF because it has higher accuracy and specificity than MLR and overfits less than CIF. Additionally, three factors can be used to discriminate pediatric LBPPF: a history of LBP, increased anterior thigh muscle flexibility, and increased spinal mobility and posterior thigh muscle flexibility.
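
The discrimination metrics quoted above (AUC, accuracy, sensitivity, and specificity) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names are hypothetical, the 0.5 decision threshold is an assumption, and any labels or scores passed in are the reader's own, not study data. Labels are assumed to be coded 1 for LBPPF and 0 for non-LBPPF.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels.

    Assumed coding (not from the paper): 1 = LBPPF, 0 = non-LBPPF.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of LBPPF cases detected
        "specificity": tn / (tn + fp),  # share of non-LBPPF cases correctly rejected
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a higher
    score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict_labels(scores: Sequence[float], threshold: float = 0.5) -> list:
    """Turn predicted probabilities into 0/1 labels (threshold is an assumption)."""
    return [1 if s >= threshold else 0 for s in scores]
```

Computing these metrics separately on the training and test sets, and comparing the two accuracy values, is one simple way to check for the kind of overfitting discussed above.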