For data that is not normally distributed, the counterpart to the ANOVA test (which assumes normally distributed data) is the Kruskal-Wallis test. It tests whether all groups are likely to come from the same population.
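As a minimal sketch of the idea, the Kruskal-Wallis test can be run with `scipy.stats.kruskal`. The three groups below are made-up values used only for illustration; they are not taken from the original post.

```python
from scipy import stats

# Hypothetical measurements for three independent groups (illustrative values only)
group_a = [2.1, 3.4, 1.9, 5.6, 2.8, 3.0]
group_b = [4.8, 6.1, 5.5, 7.9, 6.4, 5.2]
group_c = [9.3, 8.7, 12.1, 10.4, 9.9, 11.2]

# Kruskal-Wallis H-test: the null hypothesis is that all groups
# come from the same population
h_statistic, p_value = stats.kruskal(group_a, group_b, group_c)

print('H statistic: %.3f' % h_statistic)
print('p-value: %.4f' % p_value)
```

A small p-value suggests that at least one group differs from the others; it does not say which one.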
Tag: Scipy
55. Statistics: Multi-comparison with Tukey’s test and the Holm-Bonferroni method
If an ANOVA test has identified that not all groups belong to the same population, further methods may be used to identify which groups are significantly different from each other.
Below are two commonly used methods: Tukey’s and Holm-Bonferroni.
These two methods assume that data is approximately normally distributed.
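The two approaches can be sketched roughly as follows. This is an illustration under stated assumptions, not necessarily the method used in the full post: the data are made up, Tukey's test uses `scipy.stats.tukey_hsd` (which requires SciPy 1.8 or later), and `holm_adjust` is a hypothetical helper written here that applies the Holm-Bonferroni step-down correction to p-values from pairwise unpaired t-tests.

```python
from itertools import combinations
import numpy as np
from scipy import stats

# Hypothetical data for three groups (illustrative values only)
groups = {
    'a': [20.1, 21.3, 19.8, 22.0, 20.7],
    'b': [23.5, 24.1, 22.9, 25.0, 23.8],
    'c': [27.2, 26.5, 28.1, 27.7, 26.9],
}

# Tukey's HSD: result.pvalue is a matrix of pairwise p-values
tukey = stats.tukey_hsd(*groups.values())
print(tukey.pvalue)

def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values (hypothetical helper)."""
    p_values = np.asarray(p_values, dtype=float)
    m = len(p_values)
    order = np.argsort(p_values)
    # Smallest p-value is multiplied by m, the next by m-1, and so on
    scaled = p_values[order] * (m - np.arange(m))
    # Adjusted values must not decrease, and are capped at 1
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted

# Pairwise unpaired t-tests, then Holm-Bonferroni correction
pairs = list(combinations(groups, 2))
raw_p = np.array([stats.ttest_ind(groups[i], groups[j]).pvalue
                  for i, j in pairs])
adjusted_p = holm_adjust(raw_p)

for (i, j), p in zip(pairs, adjusted_p):
    print('%s vs %s: adjusted p = %.4f' % (i, j, p))
```

Tukey's test compares all pairs in one procedure, whereas Holm-Bonferroni can be applied to any set of p-values, which is why it is paired with t-tests here.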
54. Statistics: Analysis of variance (ANOVA)
One-way analysis of variance (ANOVA) tests whether multiple groups all belong to the same population or not.
If the conclusion is that the groups do not all belong to the same population, further tests may be used to identify the differences.
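A minimal sketch of a one-way ANOVA using `scipy.stats.f_oneway` is shown below. The group values are invented for illustration and are assumed to be approximately normally distributed.

```python
from scipy import stats

# Hypothetical data for three groups (illustrative values only)
group_a = [20.1, 21.3, 19.8, 22.0, 20.7]
group_b = [23.5, 24.1, 22.9, 25.0, 23.8]
group_c = [27.2, 26.5, 28.1, 27.7, 26.9]

# One-way ANOVA: the null hypothesis is that all groups share the same mean
f_statistic, p_value = stats.f_oneway(group_a, group_b, group_c)

print('F statistic: %.2f' % f_statistic)
print('p-value: %.6f' % p_value)
```

A small p-value indicates that not all group means are equal, which is the point at which a follow-up multi-comparison test becomes useful.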
53. Statistics: Mann Whitney U-test
The Mann-Whitney U test allows comparison of two groups of data where the data is not normally distributed.
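As a brief illustration, the test is available as `scipy.stats.mannwhitneyu`. The two samples below are made-up values, and the two-sided alternative is an assumption chosen for this sketch.

```python
from scipy import stats

# Hypothetical samples from two independent groups (illustrative values only)
group_1 = [1.2, 3.4, 2.2, 0.8, 2.9, 1.7]
group_2 = [4.1, 6.3, 5.0, 7.8, 4.6, 5.9]

# Two-sided Mann-Whitney U test
u_statistic, p_value = stats.mannwhitneyu(group_1, group_2,
                                          alternative='two-sided')

print('U statistic: %.1f' % u_statistic)
print('p-value: %.4f' % p_value)
```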
49. Statistics: t-tests for testing the difference between two groups of data
t-tests are ideally suited to groups of data that are normally distributed.
Unpaired t-test
A test for a difference between two independent groups (e.g. comparing the weights of men and women).
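A minimal sketch of an unpaired t-test using `scipy.stats.ttest_ind` follows. The weights are invented for illustration; by default the function assumes equal variances in the two groups, and passing `equal_var=False` gives Welch's t-test instead.

```python
from scipy import stats

# Hypothetical weights in kg (illustrative values only)
men = [78.2, 85.1, 80.4, 90.3, 76.8, 83.5, 88.0, 81.9]
women = [62.4, 70.1, 65.3, 58.9, 68.7, 63.2, 66.5, 60.8]

# Unpaired (independent samples) t-test, assuming equal variances
t_statistic, p_value = stats.ttest_ind(men, women)

# Welch's t-test, which does not assume equal variances
t_welch, p_welch = stats.ttest_ind(men, women, equal_var=False)

print('t statistic: %.2f, p-value: %.6f' % (t_statistic, p_value))
print('Welch t statistic: %.2f, p-value: %.6f' % (t_welch, p_welch))
```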
48. Statistics: One sample t-test and Wilcoxon signed rank test
The following tests check for a difference between the centre of a sample of data and a given reference point. The one sample t-test assumes normally distributed data, whereas the Wilcoxon signed rank test does not require the data to be normally distributed.
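Both tests can be sketched with `scipy.stats` as follows. The sample and the reference value of 5.0 are made up for illustration. Note that `scipy.stats.wilcoxon` tests whether differences are centred on zero, so the reference value is subtracted from the sample first.

```python
import numpy as np
from scipy import stats

# Hypothetical sample and reference point (illustrative values only)
sample = np.array([5.2, 5.9, 6.3, 5.7, 6.8, 6.1, 5.5, 6.6, 7.0, 6.4])
reference = 5.0

# One sample t-test: is the sample mean different from the reference?
t_statistic, t_p_value = stats.ttest_1samp(sample, popmean=reference)

# Wilcoxon signed rank test on the differences from the reference
w_statistic, w_p_value = stats.wilcoxon(sample - reference)

print('t-test: t = %.2f, p = %.6f' % (t_statistic, t_p_value))
print('Wilcoxon: W = %.1f, p = %.6f' % (w_statistic, w_p_value))
```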
47. Linear regression with scipy.stats
# Jupyter notebook magic to show plots inline (omit in a plain Python script)
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
# Set up x and y arrays
x=np.array([1,2,3,4,5,6,7,8,9,10])
y=np.array([2.3,4.5,5.0,8,11.1,10.9,13.9,15.4,18.2,19.5])
# Add a constant offset of 10 to every y value
y=y+10
# scipy linear regression
gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y)
# Calculate fitted y
y_fit=intercept + (x*gradient)
# Plot data
plt.plot(x, y, 'o', label='original data')
plt.plot(x, y_fit, 'r', label='fitted line')
# Add text box and legend
text='Intercept: %.1f\nslope: %.2f\nR-square: %.3f' %(intercept,gradient,r_value**2)
plt.text(6,15,text)
plt.legend()
# Display plot
plt.show()