Below is an index of posts by topic area.
Python basics
Introduction, and installing Python for healthcare modelling
if, else, elif, while, and logical operators; else after while
List comprehensions – one line loops
try … except (where code might fail)
Lambda functions (one line functions), and map/filter/reduce
Accessing date and time, and timing code
NumPy and Pandas
Converting between NumPy and Pandas
Reading and writing CSV files using NumPy and Pandas
Applying user-defined functions to NumPy and Pandas
Adding more data to NumPy arrays and Pandas dataframes
Using Pandas to merge or lookup data
Sorting and ranking with Pandas
Using masks to filter data, and perform search and replace, in NumPy and Pandas
Summarising data by groups in Pandas using pivot_tables and groupby
Reshaping Pandas data with stack, unstack, pivot and melt
Subgrouping data in Pandas with groupby
Iterating through columns and rows in NumPy and Pandas
Removing duplicate data in NumPy and Pandas
Setting width and number of decimal places in NumPy print output
Using NumPy to generate random numbers, or shuffle arrays
Matplotlib for plotting charts
Simple xy line charts, and simple save to file
Scatter plot, and adding titles to axes
Pie charts, and adding a title
Histograms (and obtaining histogram data with NumPy)
3D wireframe and surface plots
Common modifications to charts
Adding contour lines to a heatmap
Statistics
Linear regression with scipy.stats
Linear regression with scikit learn
One sample t-test and Wilcoxon signed rank test
t-tests for testing the difference between two groups of data
Multi-comparison with Tukey’s test and the Holm-Bonferroni method
Multiple comparison of non-normally distributed data with the Kruskal-Wallis test
Confidence Interval for a single proportion
Clinical pathway simulation with SimPy
A simple bed occupancy model (object-based)
A hospital bed occupancy model with queuing for a limited number of beds (object-based)
Machine Learning with Scikit Learn
Splitting data into training and test sets
Using logistic regression to diagnose breast cancer
Adding standard diagnostic performance metrics to an ML diagnosis model
How do you know if you have gathered enough data? By using learning rates.
Working with ordinal and categorical data
Choosing between models with stratified k-fold validation
Visualising accuracy and error in a classification model with a confusion matrix
Reducing data complexity, and eliminating covariance, with principal component analysis
Grouping unlabelled data with k-means clustering
Linear regression with scikit learn
Using free text for classification – ‘Bag of Words’
Worked machine learning example (for HSMA course)
Some common (and hopefully useful) algorithms
The travelling community nurse problem (aka the Travelling Salesman Problem)
Exploring the best possible trade-off between competing objectives: identifying the Pareto Front
Genetic algorithms 1. A simple genetic algorithm
Crowding distances: selecting solutions when too many multi-objective solutions exist
Miscellaneous Python
Speed up Python by 1,000 times or more using Numba!
Resources
Open data travel times from all UK LSOA to all acute hospitals