97: Simple machine learning model to predict emergency department (ED) breaches of the four-hour target

In England, emergency departments have a target that 95% of patients should be admitted or discharged from ED within four hours. Patients waiting more than four hours are known as ‘breaches’.

This notebook explores predicting emergency department (ED) breaches (patients taking more than 4 hours to be discharged or admitted). The data is from a real mid-sized acute hospital in England.

The model receives data every 2 hours and predicts whether there will be a breach in the next 2 hours.

It uses some basic ED data alongside whole-hospital data (number of occupied beds and total beds) to try to predict whether there are likely to be breaches in the next two hours. It uses a simple logistic regression model to achieve 80% accuracy in predicting breaches. The classification cut-off may be adjusted to balance accuracy in predicting breaching and non-breaching episodes (80% accuracy may be achieved in both simultaneously).

import pandas as pd
data = pd.read_csv('ed_1.csv')

Show data columns:

print (list(data))
['snapshot_id', 'snapshot_date', 'snapshot_time', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday', 'Number of Patients In department >= 4 Hours', 'Total Number of Patients in the Department', 'Number of Patients in Resus', 'Number of Patients Registered in Last 60 Minutes', 'Number of Patients Waiting Triage', 'Number of Patients Waiting to be Seen (ED)', 'Number of Patients Waiting to be Seen (Medical)', 'Number of Patients Waiting to be Seen (Surgery)', 'Number of Patients > 3 Hours', 'Number of Patients Waiting a Bed', 'Number of Patients Left Department in Last 60 Minutes', 'Free_beds', 'Breach_in_next_timeslot']

Separate data into features (X) and the label (y) to predict. y is whether there are breaches in the following 2 hours.

X = data.loc[:,"Monday":"Free_beds"]
y = data['Breach_in_next_timeslot']

Let’s see what proportion of 2 hour epochs have a breach:

print (data['Breach_in_next_timeslot'].mean())
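The proportion of breaching epochs also gives a useful baseline: a model that always predicts the more common class would already reach that level of accuracy. The sketch below illustrates the idea with a short, made-up list of labels (the values are hypothetical, not taken from the hospital data).

```python
def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common class."""
    positive_rate = sum(labels) / len(labels)
    return max(positive_rate, 1 - positive_rate)

# Hypothetical labels: 1 = breach in the next time slot, 0 = no breach
example_labels = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]

breach_rate = sum(example_labels) / len(example_labels)
baseline = majority_baseline_accuracy(example_labels)

print('Proportion of epochs with a breach:', breach_rate)
print('Majority-class baseline accuracy:', baseline)
```

Any model worth using should beat this baseline on the test set.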

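Split data into training and test sets

from sklearn.model_selection import train_test_split

# test_size is not stated in the original; 0.25 is the scikit-learn default
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)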

Normalise data with standard scaling

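from sklearn.preprocessing import StandardScaler

# Initialise a new scaling object for normalising input data
sc = StandardScaler()

# Set up the scaler just on the training set
sc.fit(X_train)

# Apply the scaler to the training and test sets
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)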
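To make explicit what standard scaling does, and why the scaler is set up on the training set only, here is a dependency-free sketch for a single feature column. The function names and numbers are illustrative assumptions. Like StandardScaler, it uses the population standard deviation.

```python
from statistics import mean, pstdev

def fit_standard_scaler(column):
    """Learn the mean and (population) standard deviation of a column."""
    return mean(column), pstdev(column)

def apply_standard_scaler(column, mu, sigma):
    """Rescale values using parameters learned elsewhere."""
    return [(value - mu) / sigma for value in column]

# Hypothetical values for one feature
train_col = [10, 20, 30, 40]
test_col = [25, 50]

# Parameters come from the training data only
mu, sigma = fit_standard_scaler(train_col)

train_std = apply_standard_scaler(train_col, mu, sigma)
test_std = apply_standard_scaler(test_col, mu, sigma)

print('Scaled training values:', train_std)
print('Scaled test values:', test_std)
```

The scaled training column has mean 0 and standard deviation 1. The test column is shifted and scaled by the same amounts, so no information from the test set leaks into the model.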

Build a logistic regression model

C=1000 sets low regularisation (C is the inverse of regularisation strength). If accuracy on the training data is significantly higher than accuracy on the test data, C should be reduced in 10-fold or 3-fold steps to maximise accuracy on the test data.

(Note: the ‘;’ at the end of the last line suppresses model description output in the Jupyter Notebook)

from sklearn.linear_model import LogisticRegression

ml = LogisticRegression(C=1000)
ml.fit(X_train_std, y_train);
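The role of C can be seen in the objective that an L2-penalised logistic regression minimises. One common way of writing it (as in the scikit-learn user guide, up to scaling conventions, and assuming labels coded as -1 and +1) is:

```latex
\min_{w,\,c} \; \frac{1}{2} w^{T} w \;+\; C \sum_{i=1}^{n} \log\!\left( 1 + \exp\!\left( -y_i \left( x_i^{T} w + c \right) \right) \right)
```

Here w are the feature weights, c is the intercept, and the sum is the data-fitting term. A large C makes the data-fitting term dominate (little regularisation), while a small C lets the penalty on the weights dominate (strong regularisation).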

Predict training and test set labels

Our model is now built. We can now predict breaches for the training and test sets. The results for the test set give the better description of accuracy, but it is useful to calculate both to look for ‘over-fitting’. If accuracy on the training data is significantly better than on the test data, the model is likely ‘over-fitted’ to the training data, and C in the model fit above should be reduced step-wise (a lower C means stronger regularisation). This will reduce accuracy in predicting the training data but should increase accuracy on the test data. Too much regularisation (too low a C) will reduce accuracy on both.

# Predict training and test set labels
y_pred_train = ml.predict(X_train_std)
y_pred_test = ml.predict(X_test_std)

Test accuracy

import numpy as np
accuracy_train = np.mean(y_pred_train == y_train)
accuracy_test = np.mean(y_pred_test == y_test)
print ('Accuracy of predicting training data =', accuracy_train)
print ('Accuracy of predicting test data =', accuracy_test)
Accuracy of predicting training data = 0.8111326090191993
Accuracy of predicting test data = 0.8151785714285714

Display weights (coefficients) of model.

# Create table of weights
weights_table = pd.DataFrame()
weights_table['feature'] = list(X)
weights_table['weight'] = ml.coef_[0]
print(weights_table)
                                              feature    weight
0                                              Monday  0.038918
1                                             Tuesday -0.026935
2                                           Wednesday  0.001615
3                                            Thursday  0.001543
4                                              Friday -0.014975
5                                            Saturday  0.011287
6                                              Sunday -0.011401
7         Number of Patients In department >= 4 Hours  1.515722
8          Total Number of Patients in the Department  0.544407
9                         Number of Patients in Resus  0.307983
10   Number of Patients Registered in Last 60 Minutes -0.444304
11                  Number of Patients Waiting Triage  0.028371
12         Number of Patients Waiting to be Seen (ED)  0.138082
13    Number of Patients Waiting to be Seen (Medical) -0.036093
14    Number of Patients Waiting to be Seen (Surgery)  0.022757
15                       Number of Patients > 3 Hours  1.265580
16                   Number of Patients Waiting a Bed  0.013085
17  Number of Patients Left Department in Last 60 ... -0.001884
18                                          Free_beds -0.369558
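Because the inputs were standardised, each weight is the change in the log-odds of a breach for a one standard deviation increase in that feature. The sketch below shows how a logistic regression combines weights into a probability. It uses only three of the weights from the table, and the intercept is an assumption (it is not shown in the output above), so the numbers are illustrative rather than the model's real predictions.

```python
import math

def breach_probability(std_features, weights, intercept):
    """Logistic function applied to the weighted sum of standardised features."""
    z = intercept + sum(w * x for w, x in zip(weights, std_features))
    return 1 / (1 + math.exp(-z))

# Weights copied from the table above for:
# patients in department >= 4 hours, patients > 3 hours, Free_beds
weights = [1.515722, 1.265580, -0.369558]
intercept = 0.0  # assumption: the fitted intercept is not reported

# All features at their training-set mean (standardised value of 0)
average_period = breach_probability([0.0, 0.0, 0.0], weights, intercept)

# One standard deviation more long waiters, one standard deviation fewer free beds
busy_period = breach_probability([1.0, 1.0, -1.0], weights, intercept)

print('Average period:', round(average_period, 3))
print('Busy period:', round(busy_period, 3))
```

The negative weight on Free_beds means that fewer free beds pushes the predicted probability of a breach up.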


Define a function for sensitivity and specificity

Sensitivity = proportion of breaching periods correctly identified
Specificity = proportion of non-breaching periods correctly identified

def calculate_sensitivity_specificity(y_test, y_pred_test):
    # Note: More intermediate values are calculated than strictly necessary.
    # This would allow other measures, beyond sensitivity and specificity, to be returned
    # Get true/false for whether a breach actually occurred
    actual_pos = y_test == 1
    actual_neg = y_test == 0
    # Get true and false test (true test match actual, false tests differ from actual)
    true_pos = (y_pred_test == 1) & (actual_pos)
    false_pos = (y_pred_test == 1) & (actual_neg)
    true_neg = (y_pred_test == 0) & (actual_neg)
    false_neg = (y_pred_test == 0) & (actual_pos)
    # Calculate accuracy
    accuracy = np.mean(y_pred_test == y_test)
    # Calculate sensitivity and specificity
    sensitivity = np.sum(true_pos) / np.sum(actual_pos)
    specificity = np.sum(true_neg) / np.sum(actual_neg)
    return sensitivity, specificity, accuracy

Show sensitivity and specificity:

sensitivity, specificity, accuracy = calculate_sensitivity_specificity(y_test, y_pred_test)
print ('Sensitivity:', sensitivity)
print ('Specificity:', specificity)
print ('Accuracy:', accuracy)
Sensitivity: 0.8488529014844804
Specificity: 0.7493403693931399
Accuracy: 0.8151785714285714
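As a worked check on these figures, the sketch below uses the smallest whole-number counts that are consistent with the printed results. The counts are inferred from the three printed values, not reported by the notebook, so treat them as an assumption.

```python
# Counts inferred from the printed results (an assumption, not notebook output)
true_pos, false_neg = 629, 112   # breaching periods: 741 in total
true_neg, false_pos = 284, 95    # non-breaching periods: 379 in total

total = true_pos + false_neg + true_neg + false_pos

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
accuracy = (true_pos + true_neg) / total

# Share of test periods that were followed by a breach
breach_rate = (true_pos + false_neg) / total

print('Sensitivity:', sensitivity)
print('Specificity:', specificity)
print('Accuracy:', accuracy)
print('Proportion of breaching periods:', breach_rate)
```

With these counts, roughly two thirds of the test periods are breaching periods, so overall accuracy sits closer to sensitivity than to specificity.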

So we are better at detecting breaches than non-breaches. This is likely because breaching sessions occur more often. Let’s adjust our model cut-off to balance the accuracy out. We’ll vary the cut-off we use and construct a sensitivity/specificity plot (very similar to a ‘Receiver Operating Characteristic’ or ‘ROC’ curve).

Balancing sensitivity and specificity

cutoff = np.arange(0.01, 1.01, 0.01)
sensitivity_results = []
specificity_results = []

# A logistic regression model has a .predict_proba method that returns
# the probability of each outcome. Some methods, such as SVC, instead use
# .decision_function to return a confidence score
y_pred_probability = ml.predict_proba(X_test_std)

for threshold in cutoff:
    # Check whether probability of positive classification is >= threshold
    y_pred_test = (y_pred_probability[:,1] >= threshold)
    # Convert boolean to 0/1 (could also simply multiply by 1)
    y_pred_test = y_pred_test.astype(int)
    # Get sensitivity and specificity
    sensitivity, specificity, accuracy = \
        calculate_sensitivity_specificity(y_test, y_pred_test)
    # Add results to list of results
    sensitivity_results.append(sensitivity)
    specificity_results.append(specificity)
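Rather than reading the balance point off a plot, the same sweep can pick the cut-off at which sensitivity and specificity are closest. The following is a minimal, dependency-free sketch of that idea. The function names, labels and probabilities are hypothetical and are not taken from the model above.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for 0/1 labels and predictions."""
    pairs = list(zip(labels, predictions))
    true_pos = sum(1 for y, p in pairs if y == 1 and p == 1)
    true_neg = sum(1 for y, p in pairs if y == 0 and p == 0)
    actual_pos = sum(labels)
    actual_neg = len(labels) - actual_pos
    return true_pos / actual_pos, true_neg / actual_neg

def balanced_cutoff(labels, probabilities, cutoffs):
    """Find the cut-off with the smallest gap between sensitivity and specificity."""
    best = None
    for cutoff in cutoffs:
        predictions = [int(p >= cutoff) for p in probabilities]
        sens, spec = sensitivity_specificity(labels, predictions)
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical labels and predicted breach probabilities
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
probabilities = [0.95, 0.9, 0.8, 0.7, 0.65, 0.4, 0.75, 0.5, 0.3, 0.2]
cutoffs = [i / 100 for i in range(1, 101)]

best_cutoff, sens, spec = balanced_cutoff(labels, probabilities, cutoffs)
print('Balanced cut-off:', best_cutoff)
print('Sensitivity:', sens, 'Specificity:', spec)
```

In practice the cut-off is better chosen on a separate validation set, so that the test set remains an unbiased check of the final model.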

Plotting specificity against sensitivity:

import matplotlib.pyplot as plt

%matplotlib inline

fig = plt.figure(figsize=(5,5))
ax1 = fig.add_subplot(111)

x = sensitivity_results
y = specificity_results

ax1.grid(True, which='both')
ax1.set_xlabel('Sensitivity (proportion of breaching\nperiods predicted correctly)')
ax1.set_ylabel('Specificity (proportion of non-breaching\nperiods predicted correctly)')

# Plot the sensitivity/specificity curve and display it
ax1.plot(x, y)
plt.show()


Plotting specificity against sensitivity shows that we can adjust our machine learning cut-off to achieve about 80% sensitivity and 80% specificity simultaneously when predicting breaches in the next 2 hours.
