*Section author: Danielle J. Navarro and David R. Foxcroft*

# Summary

Calculating some basic descriptive statistics is one of the very first things you do when analysing real data, and descriptive statistics are much simpler to understand than inferential statistics, so like every other statistics textbook I’ve started with descriptives. In this chapter, we talked about the following topics:

- Measures of central tendency: Broadly speaking, central tendency measures tell you where the data are. There are three measures that are typically reported in the literature: the mean, median and mode.
- Measures of variability: In contrast, measures of variability tell you about how “spread out” the data are. The key measures are: range, standard deviation, and interquartile range.
- Skew and kurtosis: We also looked at asymmetry in a variable’s distribution (skew) and pointiness (kurtosis).
- Getting group summaries of variables in jamovi: Since this book focuses on doing data analysis in jamovi, we spent a bit of time talking about how descriptive statistics are computed for different subgroups.
- Standard scores: The *z*-score is a slightly unusual beast. It’s not quite a descriptive statistic, and not quite an inference. Make sure you understand that section. It’ll come up again later.
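
If you’d like to see most of these measures side by side outside of jamovi, here is a minimal sketch in Python using only the standard library. It is an illustration rather than the method used in this book: the `describe` function and the `scores` data are hypothetical, the standard deviation divides by N - 1, and the quartiles use Python’s default convention, which may not match the one your software uses. Skew and kurtosis are left out to keep it short.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers (hypothetical helper)."""
    # Quartiles via Python's default "exclusive" method; other tools may use
    # a different convention and give slightly different answers.
    q1, _, q3 = statistics.quantiles(values, n=4)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by N - 1)
    return {
        # Central tendency: where the data are
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Variability: how spread out the data are
        "range": max(values) - min(values),
        "sd": sd,
        "iqr": q3 - q1,
        # Standard scores: each value expressed in standard deviations from the mean
        "z_scores": [(x - mean) / sd for x in values],
    }


scores = [2, 4, 4, 5, 7, 8, 12]  # made-up data, for illustration only
summary = describe(scores)
for name, value in summary.items():
    print(name, value)
```

Notice that the *z*-scores are just the original values re-expressed relative to the mean and standard deviation, which is why they sit somewhere between a description and an inference.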

In the next chapter we’ll move on to a discussion of how to draw pictures! Everyone loves a pretty picture, right? But before we do, I want to end on an important point. A traditional first course in statistics spends only a small proportion of the class on descriptive statistics, maybe one or two lectures at most. The vast majority of the lecturer’s time is spent on inferential statistics because that’s where all the hard stuff is. That makes sense, but it hides the practical everyday importance of choosing good descriptives. With that in mind…

# Epilogue: Good descriptive statistics are descriptive!

The death of one man is a tragedy. The death of millions is a statistic. (Josef Stalin, Potsdam 1945)

950,000 – 1,200,000 (Estimate of Soviet repression deaths, 1937-1938; Ellman, 2002)

Stalin’s infamous quote about the statistical character of the deaths of millions is worth giving some thought. The clear intent of his statement is that the death of an individual touches us personally and its force cannot be denied, but that the deaths of a multitude are incomprehensible and as a consequence are mere statistics and more easily ignored. I’d argue that Stalin was half right. A statistic is an abstraction, a description of events beyond our personal experience, and so hard to visualise. Few if any of us can imagine what the deaths of millions are “really” like, but we can imagine one death and this gives the lone death its feeling of immediate tragedy, a feeling that is missing from Ellman’s cold statistical description.

Yet it is not so simple. Without numbers, without counts, without a
description of what happened, we have *no chance* of understanding what
really happened, no opportunity even to try to summon the missing
feeling. And in truth, as I write this sitting in comfort on a Saturday
morning half a world and a whole lifetime away from the Gulags, when I
put the Ellman estimate next to the Stalin quote a dull dread settles in
my stomach and a chill settles over me. The Stalinist repression is
something truly beyond my experience, but with a combination of
statistical data and those recorded personal histories that have come
down to us, it is not entirely beyond my comprehension. Because what
Ellman’s numbers tell us is this: over a two-year period Stalinist
repression wiped out the equivalent of every man, woman and child
currently alive in the city where I live. Each one of those deaths had
its own story, was its own tragedy, and only some of those are known
to us now. Even so, with a few carefully chosen statistics, the scale of
the atrocity starts to come into focus.

Thus it is no small thing to say that the first task of the statistician
and the scientist is to summarise the data, to find some collection of
numbers that can convey to an audience a sense of what has happened.
This is the job of descriptive statistics, but it’s not a job that can
be done solely using the numbers. You are a data analyst, and not a
statistical software package. Part of your job is to take these
*statistics* and turn them into a *description*. When you analyse data
it is not sufficient to list off a collection of numbers. Always
remember that what you’re really trying to do is communicate with a
human audience. The numbers are important, but they need to be put
together into a meaningful story that your audience can interpret. That
means you need to think about framing. You need to think about context.
And you need to think about the individual events that your statistics
are summarising.