Machine Learning with scikit-learn

See also the notebooks using Titanic survival to teach classification with machine learning. These cover the essentials of machine learning classification, and include logistic regression, Random Forest, PyTorch and TensorFlow models. See here: https://pythonhealthcare.org/titanic-survival/

The iris data set

Splitting data into training and test sets

Feature Scaling

A short function to replace (impute) missing numerical data in Pandas DataFrames with the median of column values

Using logistic regression to diagnose breast cancer

Adding standard diagnostic performance metrics to a machine learning diagnosis model

How do you know if you have gathered enough data? By using learning curves.

Working with ordinal and categorical data

Support Vector Machines

Random Forests

Neural networks

Choosing between models with stratified k-fold validation

Optimising scikit-learn machine learning models with grid search or randomized search

Visualising accuracy and error in a classification model with a confusion matrix

Changing sensitivity of machine learning algorithms and plotting a receiver operating characteristic (ROC) curve

Reducing data complexity, and eliminating covariance, with principal component analysis

Feature selection 1 (univariate statistical selection)

Feature selection 2 (model selection; forward selection)

Feature selection 3 (model selection; backwards selection)

Feature expansion

Grouping unlabelled data with k-means clustering

Linear regression with scikit-learn

Random Forest regression (suitable for more complex data sets than linear regression)

Worked machine learning example (for HSMA course)

Simple machine learning model to predict emergency department (ED) breaches of the four-hour target

Oversampling to correct for imbalanced data using naive sampling or SMOTE

Regression analysis with TensorFlow