56. Statistics: Multiple comparison of non-normally distributed data with the Kruskal-Wallis test

For data that is not normally distributed, the Kruskal-Wallis test is the equivalent of the ANOVA test (which assumes normally distributed data). It tests whether all groups are likely to come from the same population.
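
As a minimal sketch of how such a test might be run, the example below uses scipy.stats.kruskal on three small groups. The group values and the 0.05 significance threshold are made up for illustration and do not come from the original post.

```python
import numpy as np
from scipy import stats

# Hypothetical example data: three groups with no overlapping values
group_a = np.array([1, 2, 3, 4, 5])
group_b = np.array([6, 7, 8, 9, 10])
group_c = np.array([11, 12, 13, 14, 15])

# Kruskal-Wallis H-test: a rank-based test that does not assume normality
h_statistic, p_value = stats.kruskal(group_a, group_b, group_c)

print('H statistic: %.2f' % h_statistic)
print('P value: %.4f' % p_value)

# Assumed significance threshold of 0.05
if p_value < 0.05:
    print('At least one group is likely to be from a different population')
else:
    print('No evidence that the groups come from different populations')
```

A small P value only indicates that not all groups are from the same population; it does not say which groups differ.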

55. Statistics: Multi-comparison with Tukey’s test and the Holm-Bonferroni method

If an ANOVA test has identified that not all groups belong to the same population, then methods may be used to identify which groups are significantly different to each other.

Below are two commonly used methods: Tukey’s and Holm-Bonferroni.

These two methods assume that data is approximately normally distributed.
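
The sketch below shows one way the two approaches might look in practice. It assumes SciPy 1.8 or later (for scipy.stats.tukey_hsd); the groups are randomly generated illustration data, and holm_bonferroni is a hypothetical helper written for this example rather than a library function. The Holm-Bonferroni step is applied here to P values from pairwise t-tests, which is one common choice but not the only one.

```python
import itertools
import numpy as np
from scipy import stats

# Hypothetical, randomly generated example data (fixed seed for repeatability)
rng = np.random.default_rng(0)
groups = {
    'a': rng.normal(10.0, 1.0, 20),
    'b': rng.normal(10.5, 1.0, 20),
    'c': rng.normal(13.0, 1.0, 20),
}

# Tukey's test: compares all pairs of groups in one call (SciPy >= 1.8)
tukey_result = stats.tukey_hsd(*groups.values())
print(tukey_result.pvalue)  # matrix of pairwise P values


def holm_bonferroni(p_values, alpha=0.05):
    """Hypothetical helper: return a boolean array marking which
    hypotheses are rejected by the Holm-Bonferroni step-down method."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    reject = np.zeros(m, dtype=bool)
    # Work through P values from smallest to largest
    for rank, idx in enumerate(np.argsort(p)):
        # Threshold relaxes as rank increases: alpha/m, alpha/(m-1), ...
        if p[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-significant P value
    return reject


# Pairwise t-tests, then Holm-Bonferroni correction of the P values
pairs = list(itertools.combinations(groups, 2))
pair_p = [stats.ttest_ind(groups[g1], groups[g2]).pvalue for g1, g2 in pairs]
for (g1, g2), rejected in zip(pairs, holm_bonferroni(pair_p)):
    print(g1, g2, 'significantly different' if rejected else 'not significant')
```

The Holm-Bonferroni method is less conservative than a plain Bonferroni correction because only the smallest P value is held to the strictest threshold.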

47. Linear regression with scipy.stats

%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# Set up x and y arrays

x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([2.3,4.5,5.0,8,11.1,10.9,13.9,15.4,18.2,19.5])
y=y+10

# scipy linear regression

gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y)

# Calculate fitted y

y_fit=intercept + (x*gradient)

# Plot data

plt.plot(x, y, 'o', label='original data')
plt.plot(x, y_fit, 'r', label='fitted line')

# Add text box and legend

text='Intercept: %.1f\nslope: %.2f\nR-square: %.3f' %(intercept,gradient,r_value**2)
plt.text(6,15,text)
plt.legend()

# Display plot

plt.show()