Below is an index of posts by topic area.

Python basics

Introduction, and installing Python for healthcare modelling


Nested Lists





math module

Variable Types

Random numbers and sequences

if, else, elif, while, and logical operators; else after while

Loops and iterating

List comprehensions – one line loops

try ... except (where code might fail)

Decimal places in output

Read from and write to files


Automatically passing unpacked lists or tuples to a function (or why do you see * before lists and tuples)

Lambda functions (one line functions), and map/filter/reduce

Accessing date and time, and timing code

NumPy and Pandas

NumPy basics: building an array from lists, basic statistics, converting to booleans, referencing the array, and taking slices

Pandas basics: building a dataframe from lists, and retrieving data from the dataframe using row and column index references

Pandas: basic statistics

Converting between NumPy and Pandas

Array maths in NumPy

Reading and writing CSV files using NumPy and Pandas

Applying user-defined functions to NumPy and Pandas

Adding more data to NumPy arrays and Pandas dataframes

Using Pandas to merge or lookup data

Sorting and ranking with Pandas

Using masks to filter data, and perform search and replace, in NumPy and Pandas

Summarising data by groups in Pandas using pivot_tables and groupby

Reshaping Pandas data with stack, unstack, pivot and melt

Subgrouping data in Pandas with groupby

Iterating through columns and rows in NumPy and Pandas

Removing duplicate data in NumPy and Pandas

Setting width and number of decimal places in NumPy print output

Using NumPy to generate random numbers, or shuffle arrays

Matplotlib for plotting charts

Simple xy line charts, and simple save to file

Scatter plot, and adding titles to axes

Bar charts

Pie charts, and adding a title

Histograms (and obtaining histogram data with NumPy)


Violin plots

3D wireframe and surface plots

Common modifications to charts

A simple heatmap

Adding contour lines to a heatmap

Creating a grid of subplots

Adding error bars to charts

Adding shaded areas to charts

Statistics


Linear regression with scipy.stats

Linear regression with scikit-learn

One sample t-test and Wilcoxon signed rank test

t-tests for testing the difference between two groups of data

Mann Whitney U-test

Analysis of variance (ANOVA)

Multi-comparison with Tukey’s test and the Holm-Bonferroni method

Multiple comparison of non-normally distributed data with the Kruskal-Wallis test

Confidence Interval for a single proportion

Chi-squared test

Fisher’s exact test

Distribution fitting to data

Clinical pathway simulation with SimPy

A simple bed occupancy model

A simple bed occupancy model (object-based)

A hospital bed occupancy model with queuing for a limited number of beds (object based)

An emergency department model in SimPy, with patient prioritisation and capacity limited by doctor availability (object based)

Machine Learning with scikit-learn

The iris data set

Splitting data into training and test sets

Feature Scaling

Using logistic regression to diagnose breast cancer

Adding standard diagnostic performance metrics to a machine learning diagnosis model

How do you know if you have gathered enough data? By using learning rates.

Working with ordinal and categorical data

Support Vector machines

Random Forests

Neural networks

Choosing between models with stratified k-fold validation

Visualising accuracy and error in a classification model with a confusion matrix

Changing sensitivity of machine learning algorithms and plotting a receiver operating characteristic (ROC) curve

Reducing data complexity, and eliminating covariance, with principal component analysis

Grouping unlabelled data with k-means clustering

Linear regression with scikit-learn

Using free text for classification – ‘Bag of Words’

Worked machine learning example (for HSMA course)

Some common (and hopefully useful) algorithms

The travelling community nurse problem (aka the Travelling Salesman Problem)

Exploring the best possible trade-off between competing objectives: identifying the Pareto Front

Genetic algorithms 1. A simple genetic algorithm

Crowding distances: selecting solutions when too many multi-objective solutions exist

Miscellaneous Python

Function decorators

Speed up Python by 1,000 times or more using numba!


Open data travel times from all UK LSOA to all acute hospitals


