Leveraging Machine Learning to Predict Mobility Outcomes in Traumatic Brain Injury: A Retrospective Analysis
Abstract
Introduction: Traumatic brain injury (TBI) is a significant global health concern, impacting over 69 million individuals annually and leading to high rates of death and disability. Despite extensive research on TBI, there remains considerable variability in patient outcomes, largely due to differences in treatment approaches across various medical centers. This inconsistency underscores the urgent need for more accurate predictive models. Our research aims to develop a machine-learning model capable of predicting Johns Hopkins Highest Level of Mobility (JH-HLM) scores, a critical metric for assessing hospital performance and patient progress. Specifically, we focus on distinguishing whether a patient's JH-HLM score falls below 4 or meets/exceeds 4. We hypothesize that a machine-learning model, integrating both demographic and clinical data, can reliably predict short-term recovery outcomes as indicated by the JH-HLM scale. This approach has the potential to enhance medical decision-making, improve patient outcomes, and minimize treatment variability across healthcare institutions.
Methods:
Study Design and Data Collection: We conducted a retrospective cohort study using data from 134 patients with diverse medical histories. The dataset included critical features such as the Glasgow Coma Scale (GCS) score at admission, patient age, gender, lesion size, midline shift, admitting diagnosis, past medical history (PMH), and the number of comorbidities.
Inclusion Criteria:
• Patients aged 16 years or older
• Confirmed diagnosis of traumatic brain injury (TBI)
• Complete data records for all key features
• Underwent assessment using the Johns Hopkins Highest Level of Mobility (JH-HLM) scale
Target Variable: The primary outcome was the JH-HLM score, which was binarized into two categories:
• Non-Mobile (0): JH-HLM score less than 4
• Mobile (1): JH-HLM score of 4 or greater
Class Distribution:
• Non-Mobile Group: 42 patients (31.3%)
• Mobile Group: 92 patients (68.7%)
Software and Data Analysis:
• Tools: Python, Scikit-learn
• Analysis Approach:
  o Conducted feature importance analysis to identify key predictors
  o Evaluated model performance using accuracy, precision, recall, F1-score, and AUC-ROC
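The binarization and evaluation workflow described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: synthetic data stands in for the patient records, and the feature columns and value ranges are assumptions for demonstration only.

```python
# Sketch of the target binarization and model evaluation described in Methods.
# All data here is synthetic; column contents are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score

rng = np.random.default_rng(0)
n = 134  # cohort size from the study

# Stand-in feature matrix (GCS, age, lesion size, midline shift, comorbidities).
X = np.column_stack([
    rng.integers(3, 16, n),    # GCS at admission (illustrative range)
    rng.integers(16, 90, n),   # age at admission
    rng.random(n) * 30,        # lesion size, arbitrary units
    rng.random(n) * 10,        # midline shift, arbitrary units
    rng.integers(0, 6, n),     # number of comorbidities
])

jh_hlm = rng.integers(1, 9, n)        # synthetic raw JH-HLM scores
y = (jh_hlm >= 4).astype(int)         # binarize: 1 = Mobile (score >= 4)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)
pred = model.predict(X_test)
proba = model.predict_proba(X_test)[:, 1]

print("accuracy:", accuracy_score(y_test, pred))
print("recall:  ", recall_score(y_test, pred))
print("auc-roc: ", roc_auc_score(y_test, proba))
```

Because the features here are random noise, the printed metrics carry no meaning; the point is the shape of the workflow (binarize, stratified split, fit, score on multiple metrics).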
Results:
Fig 1. Model Accuracy and Recall Scores. The bar plot compares six machine learning models on two key metrics: accuracy and recall. The models were trained to predict whether a patient's JH-HLM score falls below 4 or reaches 4 and above. SVC, KNN, Random Forest, and XGBoost perform well on both metrics, suggesting they are reliable for classifying patients by JH-HLM category. Logistic Regression also performs strongly, while Naive Bayes, despite lower accuracy, maintains a high recall, which may be useful in specific contexts.
Table 1. Model Performance Comparison.
Accuracy: SVC, KNN, Random Forest, and XGBoost achieved high accuracy (0.9268), correctly predicting the outcome in over 92% of cases.
Precision and Recall: Precision is consistently high across models, with Naive Bayes achieving the highest value (0.9286). Recall is perfect (1.0) for SVC, KNN, Random Forest, and XGBoost, indicating these models identified every true positive case of patient mobility.
F1 Score: The F1 score balances precision and recall; most models reach 0.962, suggesting strong identification of true positives with few false positives. The Naive Bayes F1 score is lower (0.7879), indicating a trade-off between its precision and recall.
AUC-ROC: AUC-ROC values are notably lower, with several models performing at the level of random guessing (0.5). Naive Bayes stands out slightly at 0.5088, but this still indicates limited discriminative ability.
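The combination of high accuracy, perfect recall, and chance-level AUC-ROC is the signature of a classifier that effectively predicts the majority class for every patient. A toy illustration, using only the class split from the Methods section (not the study's data or models):

```python
# Toy illustration: a degenerate classifier that labels every patient "Mobile"
# achieves accuracy equal to the positive-class prevalence and perfect recall,
# yet an AUC-ROC of 0.5, because its scores cannot rank patients at all.
import numpy as np
from sklearn.metrics import accuracy_score, recall_score, roc_auc_score

y_true = np.array([1] * 92 + [0] * 42)   # 92 Mobile, 42 Non-Mobile (68.7% / 31.3%)
y_pred = np.ones_like(y_true)            # always predict "Mobile"
y_score = np.full(y_true.shape, 0.7)     # identical score for every patient

print(accuracy_score(y_true, y_pred))    # 0.687 (the positive-class prevalence)
print(recall_score(y_true, y_pred))      # 1.0
print(roc_auc_score(y_true, y_score))    # 0.5 (no discriminative ability)
```

This does not prove the study's models behaved this way, but it shows why accuracy and recall alone can overstate performance on an imbalanced cohort, and why the low AUC-ROC values in Table 1 warrant closer inspection.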
This discrepancy between high accuracy and recall on one hand and near-chance AUC-ROC on the other suggests the models may be defaulting to the majority class, and their capacity to distinguish between levels of mobility should be re-evaluated. Fig. 2 - Feature Importance Analysis. This figure shows that the feature that most influenced our output variable (JH-HLM Binarized) was "Age at Admission". Future models should prioritize incorporating this feature and similar ones into their predictive analysis.
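The feature-importance step can be sketched as below. This is a hedged illustration on synthetic data, not the study's analysis: the feature names are assumptions, and the synthetic outcome is deliberately constructed to depend on the age column so the ranking has something to find.

```python
# Sketch of a feature-importance analysis using a Random Forest's
# impurity-based importances. Data and feature names are synthetic
# stand-ins; the outcome is constructed to depend on age by design.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
names = ["gcs_admission", "age_at_admission", "lesion_size",
         "midline_shift", "num_comorbidities"]

X = rng.random((134, len(names)))
y = (X[:, 1] > 0.55).astype(int)   # synthetic label driven by the age column

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Rank features from most to least important.
ranked = sorted(zip(names, rf.feature_importances_), key=lambda t: -t[1])
for name, imp in ranked:
    print(f"{name:20s} {imp:.3f}")
```

Impurity-based importances are convenient but can be biased toward high-cardinality features; permutation importance is a common cross-check when interpretability matters, as it does here.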
Conclusion: Our study on predicting Johns Hopkins Highest Level of Mobility (JH-HLM) scores using machine learning models has revealed both promising results and important limitations. We demonstrated that algorithms such as Random Forest and XGBoost can potentially aid in predicting patient mobility outcomes, which could inform treatment planning and resource allocation in clinical settings.

However, we must acknowledge the constraints of our current approach. The limited sample size of 134 patients and the class imbalance in our target variable pose challenges to the models' generalizability. Moreover, the complexity of some models raises concerns about interpretability, a crucial factor in medical applications where transparency in decision-making is paramount.

Moving forward, our research opens several avenues for future work. Expanding our dataset, addressing class imbalance, and exploring more interpretable models are key priorities. We also recognize the need for external validation to ensure our models perform consistently across diverse patient populations and healthcare settings.

Ultimately, this study represents a step toward leveraging artificial intelligence to enhance patient care. By continuing to refine our methods and addressing the limitations we have identified, we aim to develop robust, clinically relevant tools that can make a real difference in patient outcomes.