Forfatter av avsnitt: Danielle J. Navarro and David R. Foxcroft

Teststatistikk og utvalgsfordelinger (sampling distributions)

At this point we need to start talking specifics about how a hypothesis test is constructed. To that end, let us return to the ESP example. Let us ignore the actual data that we obtained, for the moment, and think about the structure of the experiment. Regardless of what the actual numbers are, the form of the data is that X out of N people correctly identified the colour of the hidden card. Moreover, let us suppose for the moment that the null hypothesis really is true, that ESP does not exist and the true probability that anyone picks the correct colour is exactly θ = 0.5. What would we expect the data to look like? Well, obviously we would expect the proportion of people who make the correct response to be pretty close to 50%. Or, to phrase this in more mathematical terms, we would say that X / N is approximately 0.5. Of course, we would not expect this fraction to be exactly 0.5. If, for example, we tested N = 100 people and X = 53 of them got the question right, we would probably be forced to concede that the data are quite consistent with the null hypothesis. On the other hand, if X = 99 of our participants got the question right then we would feel pretty confident that the null hypothesis is wrong. Similarly, if only X = 3 people got the answer right we would be similarly confident that the null hypothesis was wrong. Let us be a little more technical about this. We have a quantity X that we can calculate by looking at our data. After looking at the value of X we make a decision about whether to believe that the null hypothesis is correct, or to reject the null hypothesis in favour of the alternative. The name for this thing that we calculate to guide our choices is a test statistic.

Når vi har valgt en teststatistikk, er neste trinn å angi nøyaktig hvilke verdier av teststatistikken som ville fått oss til å forkaste nullhypotesen, og hvilke verdier som ville fått oss til å beholde den. For å gjøre dette må vi finne ut hva utvalgsfordelingen til teststatistikken ville vært hvis nullhypotesen faktisk var sann (vi snakket om utvalgsfordelinger tidligere i Utvalgsfordeling av gjennomsnittet). Hvorfor trenger vi dette? Fordi denne fordelingen forteller oss nøyaktig hvilke verdier av X nullhypotesen vår ville føre til at vi forventer. Og derfor kan vi bruke denne fordelingen som et verktøy for å vurdere hvor godt nullhypotesen stemmer overens med dataene våre.

Utvalgsfordeling når nullhypotesen er sann

Fig. 81 Utvalgsfordelingen for teststatistikken X når nullhypotesen er sann. For vårt ESP-scenario er dette en binomisk fordeling. Ikke overraskende, siden nullhypotesen sier at sannsynligheten for et riktig svar er θ = 0,5, sier utvalgsfordelingen at den mest sannsynlige verdien er 50 (av 100) riktige svar. Mesteparten av sannsynlighetsmassen ligger mellom 40 og 60.

How do we actually determine the sampling distribution of the test statistic? For a lot of hypothesis tests this step is actually quite complicated, and later on in the book you will see me being slightly evasive about it for some of the tests (some of them I do not even understand myself). However, sometimes it is very easy. And, fortunately for us, our ESP example provides us with one of the easiest cases. Our population parameter θ is just the overall probability that people respond correctly when asked the question, and our test statistic X is the count of the number of people who did so out of a sample size of N. We have seen a distribution like this before, in section Binomialfordelingen, and that is exactly what the binomial distribution describes! So, to use the notation and terminology that I introduced in that section, we would say that the null hypothesis predicts that X is binomially distributed, which is written:

X ~ Binomial(θ, N)

Since the null hypothesis states that θ = 0.5 and our experiment has N = 100 people, we have the sampling distribution we need. This sampling distribution is plotted in Fig. 81. No surprises really, the null hypothesis says that X = 50 is the most likely outcome, and it says that we are almost certain to see somewhere between 40 and 60 correct responses.