Section author: Danielle J. Navarro and David R. Foxcroft

Categorical data analysis

Now that we’ve covered the basic theory behind hypothesis testing, it’s time to start looking at specific tests that are commonly used in psychology. So where should we start? Not every textbook agrees on where to start, but I’m going to start with “χ² tests” (“Categorical data analysis”, this chapter) and “t-tests” (chapter Comparing two means). Both of these tools are very frequently used in scientific practice, and whilst they’re not as powerful as “regression” (chapter Correlation and linear regression) and “Analysis of Variance” (chapters Comparing several means (one-way ANOVA) and Factorial ANOVA), they’re much easier to understand. Finally, there is factor analysis, which aims to describe the variability among observed, correlated variables in terms of a smaller number of unobserved variables called factors or latent variables.

The term “categorical data” in the title of this chapter is just another name for “nominal scale data”. It’s nothing that we haven’t already discussed, it’s just that in the context of data analysis people tend to use the term “categorical data” rather than “nominal scale data”. I don’t know why. In any case, categorical data analysis refers to a collection of tools that you can use when your data are nominal scale. Those tools are often called “χ² tests” (pronounced “chi-square”, sometimes “chi-squared”). They test whether the observed frequencies differ from the expected frequencies by more than chance alone would produce, using a test statistic that approximately follows a χ² distribution when the null hypothesis is true. However, there are a lot of different tools that can be used for categorical data analysis, and this chapter covers only a few of the more common ones.
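
To make the idea of comparing observed and expected frequencies a little more concrete, here is a minimal sketch in Python. It is not how the analysis is run in this book, just an illustration of the arithmetic behind a Pearson χ² statistic. The function name `pearson_chi_square`, the counts, and the assumption that every category is equally likely under the null hypothesis are all made up for this example.

```python
def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for four categories (illustrative only).
observed = [35, 51, 64, 50]

# Assumed null hypothesis: all four categories are equally likely,
# so each expected count is the total divided by the number of categories.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

statistic = pearson_chi_square(observed, expected)
print(f"chi-square statistic: {statistic:.2f}")
```

The larger this number is, the further the observed counts sit from what the null hypothesis predicts. Turning it into a p-value requires comparing it against a χ² distribution with the appropriate degrees of freedom, which this sketch deliberately leaves out.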