*Section author: Danielle J. Navarro and David R. Foxcroft*

# Scales of measurement and types of variables

As the previous section indicates, the outcome of a psychological
measurement is called a variable. But not all variables are of the same
qualitative type and so it’s useful to understand what types there are.
A very useful concept for distinguishing between different types of
variables is what’s known as **scales of measurement**.

## Nominal scale

A **nominal scale** variable (also referred to as a **categorical**
variable) is one in which there is no particular relationship between
the different possibilities. For these kinds of variables it doesn’t
make any sense to say that one of them is “bigger” or “better” than any
other one, and it absolutely doesn’t make any sense to average them. The
classic example for this is “eye colour”. Eyes can be blue, green or
brown, amongst other possibilities, but none of them is any “bigger”
than any other one. As a result, it would feel really weird to talk
about an “average eye colour”. Similarly, gender is nominal too: male
isn’t better or worse than female. Neither does it make sense to try to
talk about an “average gender”. In short, nominal scale variables are
those for which the only thing you can say about the different
possibilities is that they are different. That’s it.

Let’s take a slightly closer look at this. Suppose I was doing research on how people commute to and from work. One variable I would have to measure would be what kind of transportation people use to get to work. This “transport type” variable could have quite a few possible values, including: “train”, “bus”, “car”, “bicycle”. For now, let’s suppose that these four are the only possibilities. Then imagine that I ask 100 people how they got to work today, with this result:

Transportation | Number of people |
---|---|
Train | 12 |
Bus | 30 |
Car | 48 |
Bicycle | 10 |

So, what’s the average transportation type? Obviously, the answer here is that there isn’t one. It’s a silly question to ask. You can say that travel by car is the most popular method, and travel by bicycle is the least popular method, but that’s about all. Similarly, notice that the order in which I list the options isn’t very interesting. I could have chosen to display the data like this…

Transportation | Number of people |
---|---|
Car | 48 |
Train | 12 |
Bicycle | 10 |
Bus | 30 |

…and nothing really changes.
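As a rough illustration of what you *can* do with a nominal scale variable, here is a minimal Python sketch that tallies the commuting data from the table above. The variable names are my own, and expanding the counts into one response per person is just a convenient assumption for the example. Counting and finding the most common category are fine; the order of the categories plays no role.

```python
from collections import Counter

# Counts taken from the table above; expanded to one response per person.
counts = {"Train": 12, "Bus": 30, "Car": 48, "Bicycle": 10}
responses = [kind for kind, n in counts.items() for _ in range(n)]

# Tallying is a legitimate operation on nominal data.
tally = Counter(responses)
most_common_type, most_common_n = tally.most_common(1)[0]

# Listing the responses in a different order gives exactly the same tally.
reordered_tally = Counter(sorted(responses))

print(most_common_type, most_common_n)  # Car 48
print(tally == reordered_tally)         # True
```

Notice that nothing in the sketch computes a mean: there is no sensible way to add “Train” to “Bus”.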

## Ordinal scale

**Ordinal scale** variables have a bit more structure than nominal scale
variables, but not by a lot. An ordinal scale variable is one in which
there is a natural, meaningful way to order the different possibilities,
but you can’t do anything else. The usual example given of an ordinal
variable is “finishing position in a race”. You *can* say that the
person who finished first was faster than the person who finished
second, but you *don’t* know how much faster. As a consequence we know
that 1st > 2nd, and we know that 2nd > 3rd, but the difference between
1st and 2nd might be much larger than the difference between 2nd and 3rd.

Here’s a more psychologically interesting example. Suppose I’m interested in people’s attitudes to climate change. I then go and ask some people to pick the statement (from four listed statements) that most closely matches their beliefs:

- (1) Temperatures are rising because of human activity
- (2) Temperatures are rising but we don’t know why
- (3) Temperatures are rising but not because of humans
- (4) Temperatures are not rising

Notice that these four statements actually do have a natural ordering, in terms of “the extent to which they agree with the current science”. Statement 1 is a close match, statement 2 is a reasonable match, statement 3 isn’t a very good match, and statement 4 is in strong opposition to current science. So, in terms of the thing I’m interested in (the extent to which people endorse the science), I can order the items as 1 > 2 > 3 > 4. Since this ordering exists, it would be very weird to list the options like this…

- (3) Temperatures are rising but not because of humans
- (1) Temperatures are rising because of human activity
- (4) Temperatures are not rising
- (2) Temperatures are rising but we don’t know why

…because it seems to violate the natural “structure” to the question.

So, let’s suppose I asked 100 people these questions, and got the following answers:

Response | Number |
---|---|
Temperatures are rising because of human activity (1) | 51 |
Temperatures are rising but we don’t know why (2) | 20 |
Temperatures are rising but not because of humans (3) | 10 |
Temperatures are not rising (4) | 19 |

When analysing these data it seems quite reasonable to try to group (1),
(2) and (3) together, and say that 81 out of 100 people were willing to
*at least partially* endorse the science. And it’s *also* quite
reasonable to group (2), (3) and (4) together and say that 49 out of 100
people registered *at least some disagreement* with the dominant
scientific view. However, it would be entirely bizarre to try to group
(1), (2) and (4) together and say that 90 out of 100 people said… what?
There’s nothing sensible that allows you to group those responses
together at all.

That said, notice that while we *can* use the natural ordering of these
items to construct sensible groupings, what we *can’t* do is average
them. For instance, in my simple example here, the “average” response to
the question is 1.97. If you can tell me what that means I’d love to
know, because it seems like gibberish to me!
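The arithmetic behind these groupings, and behind that meaningless “average”, can be checked with a short Python sketch. The counts come from the table above; the variable names are mine and are only for illustration.

```python
# Response counts from the table above, keyed by statement number (1 to 4).
counts = {1: 51, 2: 20, 3: 10, 4: 19}
total = sum(counts.values())

# Grouping adjacent categories respects the natural ordering.
at_least_partial_endorsement = counts[1] + counts[2] + counts[3]
at_least_some_disagreement = counts[2] + counts[3] + counts[4]

# The "mean" treats the labels 1 to 4 as if they were real quantities.
mean_label = sum(label * n for label, n in counts.items()) / total

print(at_least_partial_endorsement)  # 81
print(at_least_some_disagreement)    # 49
print(mean_label)                    # 1.97
```

The computer is perfectly happy to produce 1.97, which is exactly the problem: the software will not stop you from averaging an ordinal variable, so you have to know not to.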

## Interval scale

In contrast to nominal and ordinal scale variables, **interval scale**
and ratio scale variables are variables for which the numerical value is
genuinely meaningful. In the case of interval scale variables the
*differences* between the numbers are interpretable, but the variable
doesn’t have a “natural” zero value. A good example of an interval scale
variable is measuring temperature in degrees Celsius. For instance, if
it was 15° yesterday and 18° today, then the 3° difference between the two
is genuinely meaningful. Moreover, that 3° difference is *exactly the same*
as the 3° difference between 7° and 10°. In short, addition and subtraction
are meaningful for interval scale variables.[1]

However, notice that 0° does not mean “no temperature at all”. It actually
means “the temperature at which water freezes”, which is pretty arbitrary. As
a consequence it becomes pointless to try to multiply and divide temperatures.
It is wrong to say that 20° is *twice as hot* as 10°, just as it is weird and
meaningless to try to claim that 20° is negative two times as hot as -10°.
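One way to convince yourself of this is to re-express the same temperatures in a different unit whose zero point is equally arbitrary. The Python sketch below uses degrees Fahrenheit for that purpose, which is my own choice of illustration and not something the example above relies on. Equal differences stay equal after the conversion, but the “twice as hot” ratio does not survive.

```python
import math

def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Two 3-degree differences in Celsius remain equal to each other in Fahrenheit.
diff_a_f = celsius_to_fahrenheit(18) - celsius_to_fahrenheit(15)
diff_b_f = celsius_to_fahrenheit(10) - celsius_to_fahrenheit(7)
differences_agree = math.isclose(diff_a_f, diff_b_f)

# The ratio of 20 degrees to 10 degrees depends on the unit chosen.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

print(differences_agree)  # True
print(ratio_c, ratio_f)   # the two ratios are not the same
```

If a statement like “twice as hot” changes when you merely switch units, it was never a meaningful statement about temperature in the first place.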

Again, let’s look at a more psychological example. Suppose I’m interested in looking at how the attitudes of first-year university students have changed over time. Obviously, I’m going to want to record the year in which each student started. This is an interval scale variable. A student who started in 2003 did arrive 5 years before a student who started in 2008. However, it would be completely daft for me to divide 2008 by 2003 and say that the second student started “1.0025 times later” than the first one. That doesn’t make any sense at all.

## Ratio scale

The fourth and final type of variable to consider is a **ratio scale**
variable, in which zero really means zero, and it’s okay to multiply and
divide. A good psychological example of a ratio scale variable is
response time (RT). In a lot of tasks it’s very common to record the
amount of time somebody takes to solve a problem or answer a question,
because it’s an indicator of how difficult the task is. Suppose that
Alan takes 2.3 seconds to respond to a question, whereas Ben takes 3.1
seconds. As with an interval scale variable, addition and subtraction
are both meaningful here. Ben really did take 3.1 - 2.3 = 0.8 seconds
longer than Alan did. However, notice that multiplication and division
also make sense here too: Ben took 3.1 / 2.3 = 1.35 times as long as
Alan did to answer the question. And the reason why you can do this is
that for a ratio scale variable such as RT “zero seconds” really does
mean “no time at all”.
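A quick check on this is that the ratio should not care which unit of time we use. The following Python sketch repeats the calculation in seconds and then in milliseconds; the conversion to milliseconds is my own addition for illustration.

```python
import math

# Response times in seconds, from the example above.
alan_s, ben_s = 2.3, 3.1

difference_s = ben_s - alan_s  # roughly 0.8 seconds
ratio_s = ben_s / alan_s       # roughly 1.35

# Re-express both response times in milliseconds and recompute the ratio.
alan_ms, ben_ms = alan_s * 1000, ben_s * 1000
ratio_ms = ben_ms / alan_ms

print(round(difference_s, 1))            # 0.8
print(round(ratio_s, 2))                 # 1.35
print(math.isclose(ratio_s, ratio_ms))   # True
```

Because zero seconds and zero milliseconds both mean “no time at all”, changing units rescales the numbers without shifting the zero point, and the ratio comes out the same either way.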

## Continuous versus discrete variables

There’s a second kind of distinction that you need to be aware of, regarding what types of variables you can run into. This is the distinction between continuous variables and discrete variables. The difference between these is as follows:

- A **continuous variable** is one in which, for any two values that you can think of, it’s always logically possible to have another value in between.
- A **discrete variable** is, in effect, a variable that isn’t continuous. For a discrete variable it’s sometimes the case that there’s nothing in the middle.

These definitions probably seem a bit abstract, but they’re pretty simple once you see some examples. For instance, response time is continuous. If Alan takes 2.3 seconds and Ben takes 3.1 seconds to respond to a question, then Cameron’s response time will lie in between if he took 3.0 seconds. And of course it would also be possible for David to take 3.031 seconds to respond, meaning that his RT would lie in between Cameron’s and Ben’s. And while in practice it might be impossible to measure RT that precisely, it’s certainly possible in principle. Because we can always find a new value for RT in between any two other ones, we regard RT as a continuous measure.

Discrete variables occur when this rule is violated. For example, nominal scale variables are always discrete. There isn’t a type of transportation that falls “in between” trains and bicycles, not in the strict mathematical way that 2.3 falls in between 2 and 3. So transportation type is discrete. Similarly, ordinal scale variables are always discrete. Although “2nd place” does fall between “1st place” and “3rd place”, there’s nothing that can logically fall in between “1st place” and “2nd place”. Interval scale and ratio scale variables can go either way. As we saw above, response time (a ratio scale variable) is continuous. Temperature in degrees Celsius (an interval scale variable) is also continuous. However, the year you went to school (an interval scale variable) is discrete. There’s no year in between 2002 and 2003. The number of questions you get right on a true-or-false test (a ratio scale variable) is also discrete. Since a true-or-false question doesn’t allow you to be “partially correct”, there’s nothing in between 5/10 and 6/10. The relationship between the scales of measurement and the discrete/continuous distinction is summarized in Table 1. Cells with a tick mark correspond to things that are possible. I’m trying to hammer this point home, because (a) some textbooks get this wrong, and (b) people very often say things like “discrete variable” when they mean “nominal scale variable”. It’s very unfortunate.

| | continuous | discrete |
|---|---|---|
| nominal | | ✓ |
| ordinal | | ✓ |
| interval | ✓ | ✓ |
| ratio | ✓ | ✓ |
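The “always another value in between” idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: response times are stored as exact fractions so that rounding doesn’t get in the way, and the helper name `midpoint` is made up for the example. For the continuous variable we can keep finding new in-between values for as long as we like, whereas for test scores out of 10 there is simply nothing between 5 and 6.

```python
from fractions import Fraction

def midpoint(a, b):
    """Return the value exactly halfway between a and b."""
    return (a + b) / 2

# Continuous: response times in seconds, held as exact fractions.
low, high = Fraction("3.0"), Fraction("3.1")
in_between = []
for _ in range(5):
    # Each new value lies strictly between low and the previous upper value.
    high = midpoint(low, high)
    in_between.append(high)

# Discrete: scores on a 10-question true-or-false test are whole numbers.
possible_scores = range(0, 11)
scores_between_5_and_6 = [s for s in possible_scores if 5 < s < 6]

print([float(v) for v in in_between])
print(scores_between_5_and_6)  # []
```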

## Some complexities

Okay, I know you’re going to be shocked to hear this, but the real world is much messier than this little classification scheme suggests. Very few variables in real life actually fall into these nice neat categories, so you need to be kind of careful not to treat the scales of measurement as if they were hard and fast rules. It doesn’t work like that. They’re guidelines, intended to help you think about the situations in which you should treat different variables differently. Nothing more.

So let’s take a classic example, maybe *the* classic example, of a
psychological measurement tool: the **Likert scale**. The humble Likert
scale is the bread and butter tool of all survey design. You yourself
have filled out hundreds, maybe thousands, of them and odds are you’ve
even used one yourself. Suppose we have a survey question that looks
like this:

Which of the following best describes your opinion of the statement that “all pirates are freaking awesome”?

and then the options presented to the participant are these:

- (1) Strongly disagree
- (2) Disagree
- (3) Neither agree nor disagree
- (4) Agree
- (5) Strongly agree

This set of items is an example of a 5-point Likert scale, in which people are asked to choose one of several (in this case 5) clearly ordered possibilities, generally with a verbal descriptor given in each case. However, it’s not necessary that all items are explicitly described. This is a perfectly good example of a 5-point Likert scale too:

- (1) Strongly disagree
- (2)
- (3)
- (4)
- (5) Strongly agree

Likert scales are very handy, if somewhat limited, tools. The question is what kind of variable are they? They’re obviously discrete, since you can’t give a response of 2.5. They’re obviously not nominal scale, since the items are ordered; and they’re not ratio scale either, since there’s no natural zero.

But are they ordinal scale or interval scale? One argument says that we
can’t really prove that the difference between “strongly agree” and
“agree” is of the same size as the difference between “agree” and
“neither agree nor disagree”. In fact, in everyday life it’s pretty
obvious that they’re not the same at all. So this suggests that we ought
to treat Likert scales as ordinal variables. On the other hand, in
practice most participants do seem to take the whole “on a scale from 1
to 5” part fairly seriously, and they tend to act as if the differences
between the five response options were fairly similar to one another. As
a consequence, a lot of researchers treat Likert scale data as interval
scale.[2] It’s not interval scale, but in practice it’s close enough
that we usually think of it as being **quasi-interval scale**.

[1] Actually, I’ve been informed by readers with greater physics knowledge than I that temperature isn’t strictly an interval scale, in the sense that the amount of energy required to heat something up by 3° depends on its current temperature. So in the sense that physicists care about, temperature isn’t actually an interval scale. But it still makes a cute example so I’m going to ignore this little inconvenient truth.

[2] Ah, psychology… never an easy answer to anything!