Section author: Danielle J. Navarro and David R. Foxcroft

Bayesian hypothesis tests

In the chapter Hypothesis testing, I described the orthodox approach to hypothesis testing. It took an entire chapter to describe, because null hypothesis testing is a very elaborate contraption that people find very hard to make sense of. In contrast, the Bayesian approach to hypothesis testing is incredibly simple. Let us pick a setting that is closely analogous to the orthodox scenario. There are two hypotheses that we want to compare, a null hypothesis h0 and an alternative hypothesis h1. Prior to running the experiment we have some beliefs P(h) about which hypothesis is true. We run an experiment and obtain data d. Unlike frequentist statistics, Bayesian statistics does allow us to talk about the probability that the null hypothesis is true. Better yet, it allows us to calculate the posterior probability of the null hypothesis, using Bayes’ rule:

\[P(h_0 | d) = \frac{P(d|h_0) P(h_0)}{P(d)}\]

This formula tells us exactly how much belief we should have in the null hypothesis after having observed the data d. Similarly, we can work out how much belief to place in the alternative hypothesis using essentially the same equation. All we need to do is change the subscript:

\[P(h_1 | d) = \frac{P(d|h_1) P(h_1)}{P(d)}\]

It is all so simple that I feel like an idiot even bothering to write these equations down, since all I am doing is copying Bayes’ rule from the previous section.[1]
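To make the two equations concrete, here is a minimal Python sketch. It assumes that h0 and h1 are the only two hypotheses on the table, so that P(d) can be obtained from the law of total probability. The function name and the numbers are hypothetical and chosen only for illustration.

```python
def posterior_probabilities(prior_h0, likelihood_h0, likelihood_h1):
    """Return (P(h0|d), P(h1|d)) via Bayes' rule.

    Assumption: h0 and h1 are mutually exclusive and exhaustive,
    so P(h1) = 1 - P(h0).
    """
    prior_h1 = 1.0 - prior_h0
    # P(d) = P(d|h0) P(h0) + P(d|h1) P(h1)
    evidence = likelihood_h0 * prior_h0 + likelihood_h1 * prior_h1
    post_h0 = likelihood_h0 * prior_h0 / evidence
    post_h1 = likelihood_h1 * prior_h1 / evidence
    return post_h0, post_h1


# Hypothetical inputs: equal priors, data three times as likely under h1
post_h0, post_h1 = posterior_probabilities(
    prior_h0=0.5, likelihood_h0=0.1, likelihood_h1=0.3
)
print(f"P(h0|d) = {post_h0:.2f}, P(h1|d) = {post_h1:.2f}")
```

With these made-up inputs the posterior probabilities come out at roughly 0.25 for the null and 0.75 for the alternative, and they sum to 1 as they must.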

The Bayes factor

In practice, most Bayesian data analysts tend not to talk in terms of the raw posterior probabilities P(h0|d) and P(h1|d). Instead, we tend to talk in terms of the posterior odds ratio. Think of it like betting. Suppose, for instance, the posterior probability of the null hypothesis is 25%, and the posterior probability of the alternative is 75%. The alternative hypothesis is three times as probable as the null, so we say that the odds are 3:1 in favour of the alternative. Mathematically, all we have to do to calculate the posterior odds is divide one posterior probability by the other:

\[\frac{P(h_1 | d)}{P(h_0 | d)} = \frac{0.75}{0.25} = 3\]

Or, to write the same thing in terms of the equations above:

\[\frac{P(h_1 | d)}{P(h_0 | d)} = \frac{P(d|h_1)}{P(d|h_0)} \times \frac{P(h_1)}{P(h_0)}\]

Actually, this equation is worth expanding on. There are three different terms here that you should know. On the left-hand side, we have the posterior odds, which tell you what you believe about the relative plausibility of the null hypothesis and the alternative hypothesis after seeing the data. On the right-hand side, we have the prior odds, which indicate what you thought before seeing the data. In the middle, we have the Bayes factor, which describes the amount of evidence provided by the data:

\[\begin{split}\begin{array}{ccccc}\displaystyle \frac{P(h_1 | d)}{P(h_0 | d)} & = & \displaystyle\frac{P(d|h_1)}{P(d|h_0)} & \times & \displaystyle\frac{P(h_1)}{P(h_0)} \\[6pt] \\[-2pt] \uparrow & ~ & \uparrow & ~ & \uparrow \\[6pt] \mbox{Posterior odds} & ~ & \mbox{Bayes factor} & ~ & \mbox{Prior odds} \\ \end{array}\end{split}\]
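As a quick numerical check of this decomposition, the following Python sketch computes the posterior odds in two ways: directly from the two posterior probabilities, and as the Bayes factor times the prior odds. The priors and likelihoods are hypothetical, and the sketch assumes h0 and h1 are the only hypotheses under consideration.

```python
# Hypothetical inputs: a reader who initially favours the null 4:1
prior_h0, prior_h1 = 0.8, 0.2
lik_h0, lik_h1 = 0.1, 0.3  # P(d|h0) and P(d|h1)

# Route 1: apply Bayes' rule to each hypothesis, then take the ratio
evidence = lik_h0 * prior_h0 + lik_h1 * prior_h1  # P(d)
post_h0 = lik_h0 * prior_h0 / evidence
post_h1 = lik_h1 * prior_h1 / evidence
posterior_odds_direct = post_h1 / post_h0

# Route 2: Bayes factor times prior odds; P(d) cancels and is never needed
bayes_factor = lik_h1 / lik_h0
prior_odds = prior_h1 / prior_h0
posterior_odds_via_bf = bayes_factor * prior_odds

print(f"direct: {posterior_odds_direct:.3f}, via BF: {posterior_odds_via_bf:.3f}")
```

Both routes give the same posterior odds (about 0.75 here, i.e. still slightly favouring the null despite a Bayes factor of about 3), which illustrates that the data and the prior contribute as separate multiplicative terms.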

The Bayes factor (sometimes abbreviated as BF) has a special place in Bayesian hypothesis testing, because it serves a similar role to the p-value in orthodox hypothesis testing. The Bayes factor quantifies the strength of evidence provided by the data, and as such it is the Bayes factor that people tend to report when running a Bayesian hypothesis test. The reason for reporting Bayes factors rather than posterior odds is that different researchers will have different priors. Some people might have a strong bias to believe the null hypothesis is true, others might have a strong bias to believe it is false. Because of this, the polite thing for an applied researcher to do is report the Bayes factor. That way, anyone reading the paper can multiply the Bayes factor by their own personal prior odds, and they can work out for themselves what the posterior odds would be. In any case, by convention we like to pretend that we give equal consideration to both the null hypothesis and the alternative, in which case the prior odds equal 1, and the posterior odds become the same as the Bayes factor.
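The following Python sketch illustrates why reporting the Bayes factor is convenient: each reader multiplies the same reported number by their own prior odds. The reader labels, their prior odds, and the reported Bayes factor are all hypothetical.

```python
def update_odds(bayes_factor, prior_odds):
    """Posterior odds (h1 : h0) = Bayes factor x prior odds."""
    return bayes_factor * prior_odds


reported_bf = 4.0  # hypothetical Bayes factor taken from a paper

# Hypothetical prior odds (h1 : h0) held by three different readers
readers = {
    "sceptic": 0.25,     # thinks the null is four times as plausible
    "neutral": 1.0,      # equal consideration, the conventional choice
    "enthusiast": 4.0,   # already leans towards the alternative
}

posterior = {name: update_odds(reported_bf, odds) for name, odds in readers.items()}

for name, odds in posterior.items():
    print(f"{name}: posterior odds {odds:g}:1 in favour of the alternative")
```

For the neutral reader the posterior odds equal the Bayes factor itself, which is the conventional case described above.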

Interpreting Bayes factors

One of the really nice things about the Bayes factor is that the numbers are inherently meaningful. If you run an experiment and you compute a Bayes factor of 4, it means that the evidence provided by your data corresponds to betting odds of 4:1 in favour of the alternative. However, there have been some attempts to quantify the standards of evidence that would be considered meaningful in a scientific context. The two most widely used are from Jeffreys (1961) and Kass and Raftery (1995). Of the two, I tend to prefer the Kass and Raftery (1995) table because it is a bit more conservative. So here it is:

Bayes factor    Interpretation
1 – 3           Negligible evidence
3 – 20          Positive evidence
20 – 150        Strong evidence
> 150           Very strong evidence

And to be perfectly honest, I think that even the Kass and Raftery (1995) standards are being a bit charitable. If it were up to me, I would have called the “positive evidence” category “weak evidence”. To me, anything in the range 3:1 to 20:1 is “weak” or “modest” evidence at best. But there are no hard and fast rules here. What counts as strong or weak evidence depends entirely on how conservative you are and upon the standards that your community insists upon before it is willing to label a finding as “true”.
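The Kass and Raftery (1995) categories can be written as a small lookup function. This is only a sketch: the function name is hypothetical, and because the table does not say which category the boundary values 3 and 20 belong to, the code assumes they fall into the higher category. Bayes factors below 1 are rejected, since the table only covers evidence for the alternative.

```python
def interpret_bayes_factor(bf):
    """Map a Bayes factor >= 1 to a Kass and Raftery (1995) label.

    Assumption: boundary values 3 and 20 go to the higher category;
    the table itself leaves this open. 150 counts as "strong" because
    the top category is stated as "> 150".
    """
    if bf < 1:
        raise ValueError(
            "Bayes factor below 1: invert it and read it as evidence for the null"
        )
    if bf < 3:
        return "Negligible evidence"
    if bf < 20:
        return "Positive evidence"
    if bf <= 150:
        return "Strong evidence"
    return "Very strong evidence"


for bf in (2, 4, 50, 400):
    print(bf, "->", interpret_bayes_factor(bf))
```

Treat the labels as a rough guide rather than a decision rule; as noted above, what counts as strong evidence depends on the standards of your community.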

In any case, note that all the numbers listed above only make sense if the Bayes factor is greater than 1 (i.e., the evidence favours the alternative hypothesis). However, one big practical advantage of the Bayesian approach relative to the orthodox approach is that it also allows you to quantify evidence for the null. When that happens, the Bayes factor will be less than 1. You can choose to report a Bayes factor less than 1, but to be honest I find it confusing. For example, suppose that the likelihood of the data under the null hypothesis P(d|h0) is equal to 0.2, and the corresponding likelihood P(d|h1) under the alternative hypothesis is 0.1. Using the equations given above, the Bayes factor here would be:

\[\mbox{BF} = \frac{P(d|h_1)}{P(d|h_0)} = \frac{0.1}{0.2} = 0.5\]

Read literally, this result tells us that the evidence in favour of the alternative is 0.5 to 1. I find this hard to understand. To me, it makes a lot more sense to turn the equation “upside down”, and report the amount of evidence in favour of the null. In other words, what we calculate is this:

\[\mbox{BF}^\prime = \frac{P(d|h_0)}{P(d|h_1)} = \frac{0.2}{0.1} = 2\]

And what we would report is a Bayes factor of 2:1 in favour of the null. That is much easier to understand, and you can interpret it using the table above.
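This reporting convention is easy to automate. The Python sketch below (with a hypothetical function name) computes the Bayes factor from the two likelihoods and, when it is below 1, inverts it so that the reported number is always at least 1 and is accompanied by the hypothesis it favours.

```python
def report_bayes_factor(likelihood_h1, likelihood_h0):
    """Return (bayes_factor, favoured_hypothesis) with bayes_factor >= 1."""
    bf = likelihood_h1 / likelihood_h0  # evidence for h1 over h0
    if bf >= 1:
        return bf, "alternative"
    # Turn the ratio "upside down": evidence for h0 over h1
    return 1 / bf, "null"


# The example from the text: P(d|h1) = 0.1 and P(d|h0) = 0.2
bf, favoured = report_bayes_factor(likelihood_h1=0.1, likelihood_h0=0.2)
print(f"Bayes factor of {bf:g}:1 in favour of the {favoured}")
```

For the numbers used in the text this gives 2:1 in favour of the null, matching the hand calculation.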