When do we use it? When there are:

a) Relationships between variables

b) Nominal categories

c) Unrelated-design

We will code it like this: CRNU.

Unlike the previous statistical tests, Chi-Square deals with categories, so the measurement of participants’ actions is represented in the categories they are allocated to.

Nominal data (also known as categorical data)

The relevant measure is the number of participants in each category. Participants are said to be tested on both variables. See example below:

Variable 1: Giving hints

Category 1 –  Giving a hint

Category 2 –  Not giving a hint

Variable 2: Solving problems

Category 1 – Solving a problem

Category 2 – Not solving a problem

So participants would either be given a hint (Category 1) or not (Category 2), and then they would be given a problem to solve. Some will solve it (and fall in Category 1) and some will not (Category 2). With this setup, the researcher can make predictions about how the two variables relate, that is, about the number of participants that will fall in each category.

Reminder: for each variable, a participant can fall in only one category.
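As a rough illustration of how participants end up as counts per category, the sketch below tallies a handful of records into the four possible combinations. The participant records are hypothetical and invented purely for illustration; they are not data from any study.

```python
from collections import Counter

# Hypothetical records (made up for illustration): one (hint, outcome) pair
# per participant, so each participant sits in exactly one category per variable.
participants = [
    ("hint", "solved"),
    ("hint", "solved"),
    ("hint", "not solved"),
    ("no hint", "solved"),
    ("no hint", "not solved"),
    ("no hint", "not solved"),
]

# Count how many participants fall into each combination of categories.
counts = Counter(participants)

for hint in ("hint", "no hint"):
    for outcome in ("solved", "not solved"):
        print(hint, "/", outcome, ":", counts[(hint, outcome)])
```

The counts, not the individual responses, are what the Chi-Square test works with.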

Hypothesis: a higher proportion of social science students would get 1sts in their final degree classification, compared to natural science students.

A questionnaire would be sent to 50 NatSci students and 50 SocSci students, asking them about their degree classifications. The answers fall into 3 categories: 1sts, 2:1s and 2:2s. Suppose there were 44 replies from SocSci students and 42 from NatSci students.

Our raw dataset would look like this:

Degree classifications          1st   2:1   2:2   Total students
SocSci                            6    15    23   44
NatSci                           10     8    24   42
Total degree classifications     16    23    47   86

Rationale of the Chi-Square test

The number of NatSci students in the 2:1 category is called the frequency of that category. So 2:1 NatSci students have a frequency of 8 (out of 42). These frequencies are called observed frequencies, as opposed to expected frequencies, which represent the null hypothesis (i.e. no significant difference in the number of NatSci and SocSci students getting 1sts).

If the observed frequencies are significantly different from the expected frequencies, the null hypothesis can be rejected.

The calculation of the Chi-Square test will soon be uploaded as a scanned page with the step-by-step calculations.
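In the meantime, here is a minimal sketch of the calculation in Python, using the frequencies from the degree classification table. It assumes the standard approach: the expected frequency of a cell is its row total times its column total divided by the overall total, and the Chi-Square value is the sum of (observed - expected)² / expected over all cells. The function and variable names are only illustrative.

```python
# Observed frequencies from the degree classification table.
observed = {
    "SocSci": {"1st": 6, "2:1": 15, "2:2": 23},
    "NatSci": {"1st": 10, "2:1": 8, "2:2": 24},
}


def chi_square(table):
    """Return (Chi-Square value, expected frequencies) for a table of counts."""
    columns = list(next(iter(table.values())))
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {col: sum(table[row][col] for row in table) for col in columns}
    grand_total = sum(row_totals.values())

    # Expected frequency under the null hypothesis:
    # row total * column total / grand total.
    expected = {
        row: {col: row_totals[row] * col_totals[col] / grand_total for col in columns}
        for row in table
    }

    # Sum (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (table[row][col] - expected[row][col]) ** 2 / expected[row][col]
        for row in table
        for col in columns
    )
    return statistic, expected


statistic, expected = chi_square(observed)
print(round(statistic, 2))  # 3.11
```

For example, the expected frequency of SocSci students with 1sts comes out at 44 × 16 / 86, roughly 8.19, against an observed frequency of 6.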

The Chi-Square Table

The Chi-Square Critical Values Table enables you to check, given your table of frequencies, your Chi-Square value and your df (degrees of freedom), how probable it is that the differences found between conditions occurred by chance.

We open up the table and check our Chi-Square value (3.11) against the critical value for our df. With 2 rows and 3 columns, df = (2 - 1) × (3 - 1) = 2. To be significant, our value has to be equal to or larger than the critical value, which for df = 2 at the 5% level is 5.991. Our value is not larger than 5.991, so the probability that the differences found between conditions occurred by chance is more than 5%. This does not enable us to claim that the differences are statistically significant, and thus we cannot reject the null hypothesis.
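The table lookup can be sketched in a few lines of Python. The Chi-Square value and the critical value are the ones quoted in the text. The p-value line is an addition that relies on a shortcut valid only for df = 2, where the upper-tail probability equals exp(-x / 2); for any other df a statistics library or the table itself would be needed.

```python
import math

rows, columns = 2, 3        # SocSci / NatSci by 1st / 2:1 / 2:2
chi_square_value = 3.11     # value quoted in the text
critical_value = 5.991      # table value for df = 2 at the 5% level

# Degrees of freedom for a table of frequencies.
df = (rows - 1) * (columns - 1)

# Significant only if our value reaches the critical value.
significant = chi_square_value >= critical_value

# Closed form for df = 2 only (assumption stated above).
p_value = math.exp(-chi_square_value / 2)

print(df, significant, round(p_value, 2))  # 2 False 0.21
```

A probability of about 0.21 is well above 0.05, which agrees with the conclusion drawn from the table.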

Reminder: the Chi-Square test can only be used for two-tailed hypotheses. Even though the hypothesis given at the beginning sounds like a one-tailed hypothesis, it is not. This is because the hypothesis can be interpreted in more than one way: either those more likely to get 1sts decide to study social sciences, or those who study social sciences are more hard-working than NatSci students and so more SocSci students end up getting 1sts.