Here is an index of posts by topic:

# Python basics

Introduction, and installing Python for healthcare modelling

if, else, elif, while, and logical operators; else after while

List comprehensions – one line loops

try … except (where code might fail)

Lambda functions (one line functions), and map/filter/reduce

Accessing date and time, and timing code

# NumPy and Pandas

Converting between NumPy and Pandas

Reading and writing CSV files using NumPy and Pandas

Applying user-defined functions to NumPy and Pandas

Adding more data to NumPy arrays and Pandas dataframes

Using Pandas to merge or lookup data

Sorting and ranking with Pandas

Using masks to filter data, and perform search and replace, in NumPy and Pandas

Summarising data by groups in Pandas using pivot_table and groupby

Reshaping Pandas data with stack, unstack, pivot and melt

Subgrouping data in Pandas with groupby

Iterating through columns and rows in NumPy and Pandas

Removing duplicate data in NumPy and Pandas

Setting width and number of decimal places in NumPy print output

Using NumPy to generate random numbers, or shuffle arrays

# Matplotlib for plotting charts

Simple xy line charts, and simple save to file

Scatter plot, and adding titles to axes

Pie charts, and adding a title

Histograms (and obtaining histogram data with NumPy)

3D wireframe and surface plots

Common modifications to charts

Adding contour lines to a heatmap

# Statistics

Linear regression with scipy.stats

One sample t-test and Wilcoxon signed rank test

t-tests for testing the difference between two groups of data

Multi-comparison with Tukey’s test and the Holm-Bonferroni method

Multiple comparison of non-normally distributed data with the Kruskal-Wallis test

Confidence Interval for a single proportion

# Clinical pathway simulation with SimPy

# Machine Learning with scikit-learn

Splitting data into training and test sets

Using logistic regression to diagnose breast cancer

Adding standard diagnostic performance metrics to a machine learning diagnosis model

How do you know if you have gathered enough data? By using learning rates.

Working with ordinal and categorical data

Choosing between models with stratified k-fold validation

Visualising accuracy and error in a classification model with a confusion matrix

Reducing data complexity, and eliminating covariance, with principal component analysis

Grouping unlabelled data with k-means clustering

# Some common (and hopefully useful) algorithms

The travelling community nurse problem (aka the Travelling Salesman Problem)

# Miscellaneous Python

# Resources

Download zipped files (Jupyter Notebooks, PDFs, & py files)

Open data travel times from all UK LSOA to all acute hospitals