When do we use it? When there are:

a) Differences between conditions

b) One variable

c) Two conditions

d) Unrelated-design

We will code it like this: MD12Uor. The “or” stands for ordinal data and indicates that the Mann Whitney test is non-parametric.

Our hypothesis: participants who are given a title beforehand (Condition 2) will memorise a text better than participants who memorise the same text without a title (Condition 1).
So we predict not only a difference between the means of the two conditions, but also its direction: Condition 2 will have higher mean scores than Condition 1. This makes it a one-tailed hypothesis.

Our raw dataset would look like the following table.

Condition 1    Condition 2
3              9
4              7
2              5
6              10
2              6
5              8
               5
               7

Unlike in the Wilcoxon test, in the Mann Whitney test we cannot calculate differences between conditions because the conditions are not related. Instead, we rank all the scores from both conditions together in one overall ranking and give each score its corresponding position in that ranking. Tied scores share the average of the positions they occupy. We then sum the ranks for each condition. From the above data we need to calculate:

• Mean of Condition 1
• Mean of Condition 2
• Overall ranks in Condition 1
• Rank total for Condition 1 (T1)
• Overall ranks in Condition 2
• Rank total for Condition 2 (T2)

The means, overall ranks and rank totals in the table below are calculated from the raw scores above.

Condition 1    Overall ranks (Condition 1)    Condition 2    Overall ranks (Condition 2)
3              3                              9              13
4              4                              7              10.5
2              1.5                            5              6
6              8.5                            10             14
2              1.5                            6              8.5
5              6                              8              12
                                              5              6
                                              7              10.5
Mean 3.67      T1 = 24.5                      Mean 7.13      T2 = 80.5
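The ranking step can be sketched in a few lines of Python. This is only a minimal illustration, not part of the original procedure: the function name overall_ranks is hypothetical, and the split of the scores into six values for Condition 1 and eight for Condition 2 is assumed from N1 = 6, N2 = 8 and the reported means.

```python
def overall_ranks(scores):
    """Rank all scores together; tied scores share the average of their positions."""
    ordered = sorted(scores)
    rank_of = {}
    for value in set(ordered):
        first = ordered.index(value) + 1          # 1-based position of the first tie
        count = ordered.count(value)              # how many scores share this value
        rank_of[value] = first + (count - 1) / 2  # average of the positions occupied
    return [rank_of[s] for s in scores]


# Scores as assumed from the raw data table (6 in Condition 1, 8 in Condition 2)
condition_1 = [3, 4, 2, 6, 2, 5]
condition_2 = [9, 7, 5, 10, 6, 8, 5, 7]

# One overall ranking across both conditions, then split back by condition
ranks = overall_ranks(condition_1 + condition_2)
ranks_1 = ranks[:len(condition_1)]
ranks_2 = ranks[len(condition_1):]

t1 = sum(ranks_1)  # rank total for Condition 1
t2 = sum(ranks_2)  # rank total for Condition 2
mean_1 = sum(condition_1) / len(condition_1)
mean_2 = sum(condition_2) / len(condition_2)

print(ranks_1, t1)  # [3.0, 4.0, 1.5, 8.5, 1.5, 6.0] 24.5
print(ranks_2, t2)  # [13.0, 10.5, 6.0, 14.0, 8.5, 12.0, 6.0, 10.5] 80.5
```

A quick sanity check on any ranking of N scores is that the two rank totals add up to N(N + 1)/2, here 14 × 15 / 2 = 105.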

Rationale of the Mann Whitney test

The Mann Whitney test compares the rank totals of the two conditions. If the differences between the conditions are only random (as claimed by the null hypothesis), high and low ranks should be spread roughly evenly across both conditions. If instead there is a significant difference between the rank totals in the predicted direction (Condition 2 > Condition 1), the null hypothesis can be rejected.

Calculating U and U’

N1: Sample Size (N) of Condition 1

N2: Sample Size (N) of Condition 2
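The formulas themselves do not appear in this section. As an assumption, one standard formulation is given below, with T1 and T2 the rank totals of the two conditions. The subscripts U_1 and U_2 are labels introduced here for illustration. With the rank totals from the table (T1 = 24.5, T2 = 80.5) it yields the pair 44.5 and 3.5, and the smaller of the two is the U that is checked against the table.

```latex
\begin{align*}
U_1 &= N_1 N_2 + \frac{N_1 (N_1 + 1)}{2} - T_1 = 48 + 21 - 24.5 = 44.5 \\
U_2 &= N_1 N_2 + \frac{N_2 (N_2 + 1)}{2} - T_2 = 48 + 36 - 80.5 = 3.5 \\
U_1 + U_2 &= N_1 N_2 = 48
\end{align*}
```

The last line is a useful arithmetic check: the two values always add up to N1 × N2.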

The U Table

The U Table lets you check, given your one- or two-tailed hypothesis, your U value and your sample sizes, whether the probability that the differences found between conditions occurred by chance is below a given significance level. The result is significant when your U is equal to or less than the critical value in the table. Reminder: the significance levels range from 5% to 1%.

Our values are: U = 3.5, N1 = 6, N2 = 8 and hypothesis = one-tailed.
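The U value quoted above can be reproduced from the rank totals (T1 = 24.5, T2 = 80.5). The sketch below assumes the standard rank-total formula for U. The function name mann_whitney_u is hypothetical, and taking the smaller of the two results as U is likewise an assumption about the convention used here.

```python
def mann_whitney_u(n1, n2, t1, t2):
    """Return the two U values computed from the sample sizes and rank totals."""
    # Assumed standard formula: U = N1*N2 + Nx*(Nx + 1)/2 - Tx for each condition x
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - t1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - t2
    return u1, u2


n1, n2 = 6, 8          # sample sizes of Condition 1 and Condition 2
t1, t2 = 24.5, 80.5    # rank totals from the ranks table

u1, u2 = mann_whitney_u(n1, n2, t1, t2)
u = min(u1, u2)        # the smaller value is the one looked up in the U Table

print(u1, u2, u)  # 44.5 3.5 3.5
```

The two values always sum to N1 × N2 (here 48), which is a quick way to catch arithmetic slips.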

We check the one-tailed table against our N1 and N2 values in the .05 row. The critical value of U for our sample sizes is 10. Our U (3.5) is less than 10, so the probability that the differences found between conditions occurred by chance is less than 5%. This enables us to claim that the differences are statistically significant, and thus we can reject the null hypothesis. Next we look at the .01 row: our U is also less than the critical value for .01 (6). This gives us stronger statistical significance, and we can claim that the probability that the differences occurred by chance is less than 1% (better than the 5% we initially found).
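The table lookup can be cross-checked by brute force. The sketch below does not use the U Table. It swaps in a different technique, an exact permutation count: it goes through every way of assigning 6 of the 14 overall ranks to Condition 1 and counts how many give a rank total as low as the observed T1 = 24.5. The variable names are illustrative, and the ranks are taken from the ranks table.

```python
from itertools import combinations

# Overall ranks per condition, copied from the ranks table
ranks_1 = [3, 4, 1.5, 8.5, 1.5, 6]
ranks_2 = [13, 10.5, 6, 14, 8.5, 12, 6, 10.5]

all_ranks = ranks_1 + ranks_2
observed_t1 = sum(ranks_1)  # 24.5

total = 0        # number of possible allocations of ranks to Condition 1
as_extreme = 0   # allocations with a rank total at least as low as observed

for subset in combinations(all_ranks, len(ranks_1)):
    total += 1
    if sum(subset) <= observed_t1:
        as_extreme += 1

# One-tailed: only low rank totals for Condition 1 count as evidence
p_one_tailed = as_extreme / total

print(as_extreme, total)     # 10 3003
print(p_one_tailed < 0.01)   # True
```

Under this assumption the one-tailed probability is 10 out of 3003, roughly 0.003, which agrees with the table's conclusion that p < 0.01.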

Reminder: always check that the differences are in the direction predicted by the one-tailed hypothesis. In our case, the data shows that participants scored higher in Condition 2 (title) than they did in Condition 1 (no title) and the differences are statistically significant (p < 0.01).