MySurgeryRisk Model Card

Model Details

Overview

MySurgeryRisk is a predictive model that estimates the likelihood that a patient will require prolonged mechanical ventilation (MV) after a major surgical procedure. Specifically, it forecasts the risk of a patient needing mechanical ventilation for more than 48 hours post-surgery.

Figure 1. Temporal Associations Between Automated Real-Time Data Inputs and Outcome Prediction Windows

Owners

University of Florida Intelligent Clinical Care Center (ic3-center@ufl.edu)

Version

v1.0, Dec 5, 2024

License

CC BY-NC 4.0

Model Sources

Model Parameters

Architecture

The model is a random forest classifier.

Input

Tabular data with 78 features, including 1) Socio-demographics (e.g., age, sex, race, ethnicity, language, area median income); 2) Admission information (e.g., emergent admission, admission source, night admission); 3) Comorbidities (e.g., diabetes, hypertension, cancer); 4) Scheduled procedure information (e.g., procedure code, surgeons, anesthesia type); 5) Historical medications (e.g., vancomycin, aspirin, beta-blockers); 6) Preoperative laboratory results (e.g., serum creatinine, hemoglobin, serum anion gap).

Additional Document

Output

The model outputs a probability score, ranging from 0 to 1, indicating the likelihood of a patient requiring prolonged MV post-surgery.

Training Datasets

UFH Gainesville training dataset: The dataset included all patients 18 years or older who were admitted to University of Florida Health (UFH) Gainesville for any type of inpatient surgical procedure. The final cohort consisted of 41,812 patients who received 52,117 procedures between June 1, 2014 and November 27, 2018. Each patient's medical record contained heterogeneous variables (e.g., demographic characteristics and medical history, diagnoses and procedures, medications, laboratory results, and vital signs).

Labeling: The use of mechanical ventilation was identified using EHR data representing respiratory devices, ventilation modes, and measured values for respiratory vitals, including oxygen flow rate, tidal volume, and positive end-expiratory pressure. The detailed logic for mechanical ventilation identification is illustrated in Figure 2. The outcome distribution is presented in Figure 3.
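As a minimal sketch of the final labeling step only, the 48-hour threshold from the Overview can be applied to a ventilation episode once its start and end times are known. The function name, its arguments, and the treatment of a single continuous episode are assumptions made for illustration. The EHR-based identification logic in Figure 2 is not reproduced here.

```python
from datetime import datetime, timedelta

# Threshold taken from the Overview: ventilation for more than 48 hours.
PROLONGED_MV_THRESHOLD = timedelta(hours=48)


def is_prolonged_mv(mv_start, mv_end):
    """Return True when one ventilation episode lasts longer than 48 hours.

    Hypothetical helper: assumes a single continuous postoperative episode
    with known start and end timestamps, which is a simplification.
    """
    if mv_start is None or mv_end is None:
        return False  # no ventilation episode recorded
    return (mv_end - mv_start) > PROLONGED_MV_THRESHOLD


# Illustrative timestamps, not taken from any dataset.
start = datetime(2020, 1, 1, 8, 0)
print(is_prolonged_mv(start, start + timedelta(hours=49)))  # longer than 48 h
print(is_prolonged_mv(start, start + timedelta(hours=48)))  # exactly 48 h
```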

Figure 2. The logic for the identification of mechanical ventilation use
Figure 3. The outcome distribution of training dataset across all patients and subgroups stratified by sex, race and age

Evaluation Datasets

UFH Gainesville evaluation dataset: The dataset included all patients 18 years or older who were admitted to University of Florida Health (UFH) Gainesville for any type of inpatient surgical procedure. The final cohort consisted of 19,132 patients who received 22,300 procedures between November 28, 2018 and September 20, 2020. We present the outcome distribution in Figure 4.

Figure 4. The outcome distribution of the evaluation dataset across all patients and subgroups stratified by sex, race and age

Training Details

Hyperparameters were selected using 5-fold cross-validation, and the final model was then trained on the entire training dataset with the selected values.

Training Hyperparameters

min_samples_leaf: 10
n_estimators: 1500
max_features: 10
class_weight: balanced
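The hyperparameter names above match the arguments of scikit-learn's `RandomForestClassifier`, so the following sketch assumes that library. The model card does not name the implementation. The synthetic data, the `build_model` helper, the fixed random seed, and the reduced tree count in the demo are all illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Values as listed in this model card.
HYPERPARAMETERS = {
    "min_samples_leaf": 10,
    "n_estimators": 1500,
    "max_features": 10,
    "class_weight": "balanced",
}


def build_model(**overrides):
    """Hypothetical helper: build a random forest with the listed settings."""
    params = {**HYPERPARAMETERS, **overrides}
    return RandomForestClassifier(random_state=0, **params)


# Synthetic stand-in for the 78-feature tabular input (not real patient data).
X, y = make_classification(
    n_samples=400, n_features=78, n_informative=10,
    weights=[0.9, 0.1], random_state=0,
)

# Fewer trees than the listed 1500, only to keep the demo fast.
demo_model = build_model(n_estimators=50)

# 5-fold cross-validated AUROC, mirroring the selection procedure described.
cv_auroc = cross_val_score(demo_model, X, y, cv=5, scoring="roc_auc")

# Fit on the full training data, then output a probability score in [0, 1].
demo_model.fit(X, y)
risk = demo_model.predict_proba(X)[:, 1]
print(cv_auroc.mean(), risk.min(), risk.max())
```

With `class_weight="balanced"`, scikit-learn weights each class inversely to its frequency, which is a common choice when the positive outcome is rare.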

Quantitative Analysis

Metrics

Model performance was evaluated using several metrics, including area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). 95% confidence intervals (CI) for all performance measures were calculated using bootstrap sampling and nonparametric methods. Detailed evaluation results are presented in the 'Evaluation Results' section.
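A bootstrap confidence interval of the kind described above can be sketched as follows. This assumes a simple percentile bootstrap over cases. The exact resampling scheme, number of resamples, and resampling unit (patient or procedure) are not stated in this model card, and the labels below are made up for illustration.

```python
import random


def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


def bootstrap_ci(y_true, y_pred, metric, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        value = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if value == value:  # drop NaN (resample without positives)
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * (len(stats) - 1))]
    upper = stats[int((1 - alpha / 2) * (len(stats) - 1))]
    return lower, upper


# Made-up labels: 40 positives (34 detected) and 160 negatives (30 flagged).
y_true = [1] * 40 + [0] * 160
y_pred = [1] * 34 + [0] * 6 + [1] * 30 + [0] * 130

point = sensitivity(y_true, y_pred)
ci_low, ci_high = bootstrap_ci(y_true, y_pred, sensitivity)
print(point, ci_low, ci_high)
```

The same `bootstrap_ci` function accepts any metric with the signature `metric(y_true, y_pred)`, so specificity, PPV, and NPV can reuse it.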

Evaluation Results

UFH Gainesville evaluation dataset
AUROC: 0.91 (0.9-0.91)
AUPRC: 0.45 (0.41-0.48)
NPV: 0.99 (0.99-0.99)
PPV: 0.21 (0.2-0.24)
Sensitivity: 0.85 (0.82-0.87)
Specificity: 0.82 (0.8-0.84)
Figure 5. AUROC curve for prolonged mechanical ventilation complication
Figure 6. AUPRC curve for prolonged mechanical ventilation complication
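The combination of a high NPV and a low PPV at good sensitivity and specificity is typical when the outcome is rare. The sketch below applies Bayes' rule to the reported sensitivity (0.85) and specificity (0.82). The 5% prevalence is a hypothetical value chosen for illustration and is not reported in this model card.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Derive PPV and NPV from sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true positive fraction
    fp = (1 - specificity) * (1 - prevalence)  # false positive fraction
    tn = specificity * (1 - prevalence)        # true negative fraction
    fn = (1 - sensitivity) * prevalence        # false negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Assumption: illustrative outcome prevalence, not taken from the model card.
ASSUMED_PREVALENCE = 0.05

ppv, npv = ppv_npv(0.85, 0.82, ASSUMED_PREVALENCE)
print(round(ppv, 3), round(npv, 3))
```

Under this assumed prevalence the derived values fall close to the reported PPV and NPV, which suggests the low PPV reflects outcome rarity rather than poor discrimination.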

Explainability

Using SHapley Additive exPlanations (SHAP) on the evaluation dataset, we identified the key features contributing to prolonged MV risk prediction, as illustrated in Figure 7. The primary procedure code emerged as the most influential feature. The attending surgeon, preoperative serum calcium and glucose levels, and surgery type also ranked among the five most influential features.

Figure 7. Ten important features contributing to the prolonged MV risk prediction

Bias and Fairness

We evaluated bias in the dataset and in the prediction model across three sensitive attributes: sex, race, and age. The evaluation results are shown in Figure 8 and Figure 9. The prediction model and dataset satisfy several important fairness criteria, such as statistical parity, average odds, and equal opportunity, but the disparate impact metric indicates potential unfairness in selection rates across different groups (sex and race). This suggests that while the model maintains overall balance in its predictions, there may be subtle distributional differences that disproportionately affect certain groups (Figures 3 and 4). The 80% rule (Four-Fifths Rule) was applied to determine whether bias is present.
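To illustrate how statistical parity can look acceptable while disparate impact fails, the sketch below computes both from binary predictions for two groups. The group predictions are made up, the function names are hypothetical, and the symmetric 0.8 to 1.25 acceptance band is one common reading of the Four-Fifths Rule rather than the confirmed definition used for Figures 8 and 9.

```python
def selection_rate(y_pred):
    """Fraction of cases flagged as high risk (predicted positive)."""
    return sum(y_pred) / len(y_pred)


def statistical_parity_difference(unprivileged, privileged):
    """Difference in selection rates; 0 means parity."""
    return selection_rate(unprivileged) - selection_rate(privileged)


def disparate_impact(unprivileged, privileged):
    """Ratio of selection rates; 1 means parity."""
    return selection_rate(unprivileged) / selection_rate(privileged)


def passes_four_fifths_rule(ratio, threshold=0.8):
    """Assumed symmetric band: threshold <= ratio <= 1 / threshold."""
    return threshold <= ratio <= 1 / threshold


# Made-up binary predictions for two groups (not from the datasets).
group_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # selection rate 0.2
group_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # selection rate 0.3

spd = statistical_parity_difference(group_a, group_b)
di = disparate_impact(group_a, group_b)
print(spd, di, passes_four_fifths_rule(di))
```

Because disparate impact is a ratio, a small absolute gap in selection rates can still fall outside the band when both rates are low, which is consistent with the pattern described above.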

Metrics

Figure 8. The summary of dataset bias across subgroups stratified by sex, race and age
Figure 9. The summary of model bias across subgroups stratified by sex, race and age

Considerations

Primary Use Cases

Intended Users

Out of Scope Use Cases

Limitations

Ethical Considerations