For those unfamiliar with statistics, it can be confusing to decide which
test to apply when determining whether observed changes are statistically
significant (i.e. p<0.05). This post provides some guidance on how to analyse
data from two groups (e.g. placebo vs drug treatment, normal vs diseased,
light vs dark, etc).
What Are The Differences?
The t-test compares the means of two populations. It is a parametric test
and should only be applied to data that is normally distributed. In contrast,
the Mann-Whitney U (MWU) test is a non-parametric test that is sensitive to
differences in medians as well as in the shape and spread of the data. It can
be used as an alternative to the t-test on data that is not normally
distributed. Note, however, that the MWU test can also be applied to normally
distributed data.
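To make the contrast concrete, here is a minimal sketch of what each test statistic measures, using only the Python standard library. The two samples are invented for illustration, the function names are hypothetical, and the t statistic shown is the Welch (unequal variance) variant, which is an assumption since the post does not say which t-test it has in mind. No p-values are computed here.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements, invented for illustration only.
normal = [52.0, 61.0, 58.0, 70.0, 66.0, 49.0]
diseased = [78.0, 85.0, 69.0, 91.0, 74.0, 88.0]


def welch_t(a, b):
    """Welch's t statistic: difference in means divided by its standard error."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def mann_whitney_u(a, b):
    """U statistic for sample a: number of (a, b) pairs where a is larger.

    Ties count as half a pair. Only the ordering of values matters,
    not their magnitudes, which is why no normality assumption is needed.
    """
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


t_stat = welch_t(normal, diseased)
u_normal = mann_whitney_u(normal, diseased)
u_diseased = mann_whitney_u(diseased, normal)

print(f"t = {t_stat:.3f}")
print(f"U (normal) = {u_normal}, U (diseased) = {u_diseased}")
```

The two U values always sum to the number of pairs (here 6 x 6 = 36). A U value close to 0 or close to the number of pairs means that one group almost always ranks above the other.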
The number of data points, or sample size, can also affect the choice of
test. If you have a large dataset with a normal distribution, a t-test can be
very powerful. But if you have a small number of data points (e.g. fewer than
6), a MWU test would be preferable to a t-test, since normality is difficult
to establish from so few points.
What Do I Use?
To determine which test to apply, you will first need to establish whether
your experimental data is normally distributed. For illustrative purposes, a
mock dataset (below) will be used. It comes from two groups (normal vs
diseased). The sample type is coded so that the experimenter is blinded to it.
There are a number of normality tests to choose from, but if you have a
dataset with fewer than 2000 data points, as above, try using the Shapiro-Wilk
normality test. The null hypothesis is that your data points belong to a
normal distribution; conversely, the alternative hypothesis is that they do
not. If p<0.05, you reject the null hypothesis and conclude that your data
does not have a normal distribution. For the above dataset, the results of
running the Shapiro-Wilk test are as follows:
n = 24
Mean = 66.33
SD = 21.25
W = 0.9438
Threshold (p=0.01) = 0.884
Threshold (p=0.05) = 0.916
Threshold (p=0.10) = 0.930
Here, W is greater than the threshold value at every significance level, so
p>0.10. Accordingly, we fail to reject the null hypothesis and treat the data
points as having a normal distribution.
** The results above were calculated using
an online Shapiro-Wilk calculator.
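The decision rule applied to the output above can be sketched as a simple comparison: small values of W indicate departure from normality, so the null hypothesis is rejected only when W falls below the threshold for the chosen significance level. The numbers are the ones reported above, rounded; the function and variable names are hypothetical.

```python
# W statistic and threshold values as reported in the output above (rounded).
w_statistic = 0.9438
critical_values = {0.01: 0.884, 0.05: 0.916, 0.10: 0.930}


def reject_normality(w, w_critical):
    """Reject the null hypothesis of normality when W is below the threshold."""
    return w < w_critical


decisions = {
    alpha: reject_normality(w_statistic, w_critical)
    for alpha, w_critical in critical_values.items()
}

for alpha, rejected in decisions.items():
    verdict = "reject" if rejected else "fail to reject"
    print(f"alpha = {alpha:.2f}: {verdict} the null hypothesis of normality")
```

Failing to reject the null hypothesis does not prove that the data is normal; it only means the test found no evidence against normality at these levels.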
For sample sizes greater than 2000, you can
use the Kolmogorov-Smirnov test. If you prefer to visualize the data in
graphical form, try using a normal probability plot or quantile-quantile plot.
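As a rough sketch of both ideas, the code below computes the points of a normal quantile-quantile plot and the Kolmogorov-Smirnov distance between a sample and a fitted normal distribution, using only the Python standard library. The sample is invented, the function names are hypothetical, and the plotting positions (i + 0.5) / n are one common convention among several. Because the mean and standard deviation are estimated from the sample itself, the standard Kolmogorov-Smirnov thresholds do not strictly apply, so only the statistic is computed and no p-value is reported.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample, invented for illustration only.
sample = [41.0, 48.0, 55.0, 59.0, 63.0, 66.0, 70.0, 74.0, 81.0, 92.0]


def qq_points(data):
    """Pair each ordered observation with its expected normal quantile.

    If the data is roughly normal, the pairs lie close to the line y = x.
    """
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    ordered = sorted(data)
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


def ks_statistic(data):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    largest_gap = 0.0
    for i, x in enumerate(sorted(data)):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i + 1)/n at x; check both sides.
        largest_gap = max(largest_gap, (i + 1) / n - cdf, cdf - i / n)
    return largest_gap


points = qq_points(sample)
d_stat = ks_statistic(sample)

for theoretical, observed in points:
    print(f"{theoretical:8.2f}  {observed:8.2f}")
print(f"D = {d_stat:.4f}")
```

The (theoretical, observed) pairs can be passed to any plotting tool to draw the quantile-quantile plot.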
Other Ways To Test Normality
Aside from the normality tests mentioned
above, there are other normality tests available. For their description, please
refer to the following link.