Section author: Danielle J. Navarro and David R. Foxcroft

Bayesian hypothesis tests

In the chapter on hypothesis testing, I described the orthodox approach to hypothesis testing. It took an entire chapter to describe, because null hypothesis testing is a very elaborate contraption that people find very hard to make sense of. In contrast, the Bayesian approach to hypothesis testing is incredibly simple. Let us pick a setting that is closely analogous to the orthodox scenario. There are two hypotheses that we want to compare, a null hypothesis h0 and an alternative hypothesis h1. Prior to running the experiment we have some beliefs P(h) about which hypotheses are true. We run an experiment and obtain data d. Unlike frequentist statistics, Bayesian statistics does allow us to talk about the probability that the null hypothesis is true. Better yet, it allows us to calculate the posterior probability of the null hypothesis, using Bayes’ rule:

\[P(h_0 | d) = \frac{P(d|h_0) P(h_0)}{P(d)}\]

This formula tells us exactly how much belief we should have in the null hypothesis after having observed the data d. Similarly, we can work out how much belief to place in the alternative hypothesis using essentially the same equation. All we do is change the subscript:

\[P(h_1 | d) = \frac{P(d|h_1) P(h_1)}{P(d)}\]

It is all so simple that I feel like an idiot even bothering to write these equations down, since all I am doing is copying Bayes’ rule from the previous section.[1]
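The two equations can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes that h0 and h1 are the only two candidate hypotheses, so that P(d) can be obtained from the law of total probability, and the function name and input values are purely illustrative.

```python
def posterior_probabilities(prior_h0, likelihood_h0, likelihood_h1):
    """Return (P(h0|d), P(h1|d)) for two mutually exclusive, exhaustive hypotheses."""
    prior_h1 = 1.0 - prior_h0  # assumption: h0 and h1 are the only candidates
    # P(d) via the law of total probability
    evidence = likelihood_h0 * prior_h0 + likelihood_h1 * prior_h1
    post_h0 = likelihood_h0 * prior_h0 / evidence  # Bayes' rule for h0
    post_h1 = likelihood_h1 * prior_h1 / evidence  # Bayes' rule for h1
    return post_h0, post_h1


# Hypothetical inputs: equal priors, data three times as likely under h1.
post_h0, post_h1 = posterior_probabilities(
    prior_h0=0.5, likelihood_h0=0.1, likelihood_h1=0.3
)
print(f"P(h0|d) = {post_h0:.2f}, P(h1|d) = {post_h1:.2f}")
# prints: P(h0|d) = 0.25, P(h1|d) = 0.75
```

Because both posteriors share the same denominator P(d), they always sum to 1 under this assumption.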

The Bayes factor

In practice, most Bayesian data analysts tend not to talk in terms of the raw posterior probabilities P(h0|d) and P(h1|d). Instead, we tend to talk in terms of the posterior odds ratio. Think of it like betting. Suppose, for instance, the posterior probability of the null hypothesis is 25%, and the posterior probability of the alternative is 75%. The alternative hypothesis is three times as probable as the null, so we say that the odds are 3:1 in favour of the alternative. Mathematically, all we have to do to calculate the posterior odds is divide one posterior probability by the other:

\[\frac{P(h_1 | d)}{P(h_0 | d)} = \frac{0.75}{0.25} = 3\]

Or, to write the same thing in terms of the equations above:

\[\frac{P(h_1 | d)}{P(h_0 | d)} = \frac{P(d|h_1)}{P(d|h_0)} \times \frac{P(h_1)}{P(h_0)}\]
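To see where this comes from, divide the Bayes’ rule expression for the alternative by the one for the null. Both share the denominator P(d), so it cancels, and the remaining terms can be regrouped into a ratio of likelihoods and a ratio of priors:

\[\frac{P(h_1 | d)}{P(h_0 | d)} = \frac{P(d|h_1) P(h_1) / P(d)}{P(d|h_0) P(h_0) / P(d)} = \frac{P(d|h_1) P(h_1)}{P(d|h_0) P(h_0)} = \frac{P(d|h_1)}{P(d|h_0)} \times \frac{P(h_1)}{P(h_0)}\]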

Actually, this equation is worth expanding on. There are three different terms here that you should know. On the left-hand side, we have the posterior odds, which tells you what you believe about the relative plausibility of the null hypothesis and the alternative hypothesis after seeing the data. On the right-hand side, we have the prior odds, which indicates what you thought before seeing the data. In the middle, we have the Bayes factor, which describes the amount of evidence provided by the data:

\[\begin{split}\begin{array}{ccccc}\displaystyle \frac{P(h_1 | d)}{P(h_0 | d)} & = & \displaystyle\frac{P(d|h_1)}{P(d|h_0)} & \times & \displaystyle\frac{P(h_1)}{P(h_0)} \\[6pt] \\[-2pt] \uparrow & ~ & \uparrow & ~ & \uparrow \\[6pt] \mbox{Posterior odds} & ~ & \mbox{Bayes factor} & ~ & \mbox{Prior odds} \\ \end{array}\end{split}\]

The Bayes factor (often abbreviated as BF) has a special place in Bayesian hypothesis testing, because it serves a similar role to the p-value in frequentist hypothesis testing. The Bayes factor quantifies the strength of evidence provided by the data, and as such it is the Bayes factor that people usually report when running a Bayesian hypothesis test. The reason for reporting Bayes factors rather than posterior odds is that different researchers will have different priors. Some people are inclined to believe the null hypothesis is true, others are inclined to believe it is false. Because of this, it is better for an applied researcher to report the Bayes factor. That way, anyone reading the paper can multiply the Bayes factor by their own personal prior odds and work out for themselves what the posterior odds would be. In any case, we like to pretend that we give equal consideration to both the null hypothesis and the alternative. In that case the prior odds equal 1, and the posterior odds are the same as the Bayes factor.
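The multiplication that a reader would carry out can be sketched in a few lines. This is an illustrative example rather than anything prescribed by the text: the function names, the reported Bayes factor, and the three readers with their prior odds are all hypothetical, and converting odds back to a probability assumes that h0 and h1 are the only hypotheses in play.

```python
def posterior_odds(bayes_factor, prior_odds=1.0):
    """Posterior odds for h1 over h0: Bayes factor times prior odds."""
    return bayes_factor * prior_odds


def odds_to_probability(odds):
    """Convert odds for h1 over h0 into P(h1|d), assuming only h0 and h1 exist."""
    return odds / (1.0 + odds)


reported_bf = 3.0  # hypothetical Bayes factor taken from a paper

# Hypothetical readers, each with their own prior odds P(h1) / P(h0)
readers = {"sceptic": 0.25, "neutral": 1.0, "enthusiast": 4.0}
for name, prior in readers.items():
    odds = posterior_odds(reported_bf, prior)
    prob = odds_to_probability(odds)
    print(f"{name}: posterior odds {odds:g}:1, P(h1|d) = {prob:.2f}")
```

With prior odds of 1 the posterior odds equal the reported Bayes factor, which is the conventional case described above.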

Interpreting Bayes factors

One of the really nice things about the Bayes factor is the numbers are inherently meaningful. If you run an experiment and you compute a Bayes factor of 4, it means that the evidence provided by your data corresponds to betting odds of 4:1 in favour of the alternative. However, there have been some attempts to quantify the standards of evidence that would be considered meaningful in a scientific context. The two most widely used are from Jeffreys (1961) and Kass and Raftery (1995). Of the two, I tend to prefer the Kass and Raftery (1995) table because it is a bit more conservative. So here it is:

Bayes factor    Interpretation
1 – 3           Negligible evidence
3 – 20          Positive evidence
20 – 150        Strong evidence
> 150           Very strong evidence

And to be perfectly honest, I think that even the Kass and Raftery (1995) standards are being a bit charitable. If it were up to me, I would have called the “positive evidence” category “weak evidence”. To me, anything in the range 3:1 to 20:1 is “weak” or “modest” evidence at best. But there are no hard and fast rules here. What counts as strong or weak evidence depends entirely on how conservative you are and upon the standards that your community insists upon before it is willing to label a finding as “true”.
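For readers who want to apply the table mechanically, here is a small sketch of a lookup function. The function name is hypothetical, and since the table does not say which category a boundary value such as exactly 3 or 20 belongs to, the code assumes that each upper bound belongs to the lower category.

```python
def kass_raftery_label(bayes_factor):
    """Map a Bayes factor of at least 1 onto the Kass and Raftery (1995) categories."""
    if bayes_factor < 1:
        raise ValueError("expected a Bayes factor >= 1; invert it first")
    # Assumption: each upper bound is counted in the lower category.
    if bayes_factor <= 3:
        return "Negligible evidence"
    if bayes_factor <= 20:
        return "Positive evidence"
    if bayes_factor <= 150:
        return "Strong evidence"
    return "Very strong evidence"


for bf in (2, 4, 50, 500):
    print(bf, "->", kass_raftery_label(bf))
```

The labels are only a convention, so a stricter community could swap in its own thresholds without changing the structure of the function.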

In any case, note that all the numbers listed above make sense if the Bayes factor is greater than 1 (i.e., the evidence favours the alternative hypothesis). However, one big practical advantage of the Bayesian approach relative to the orthodox approach is that it also allows you to quantify evidence for the null. When that happens, the Bayes factor will be less than 1. You can choose to report a Bayes factor less than 1, but to be honest I find it confusing. For example, suppose that the likelihood of the data under the null hypothesis P(d|h0) is equal to 0.2, and the corresponding likelihood P(d|h1) under the alternative hypothesis is 0.1. Using the equations given above, the Bayes factor here would be:

\[\mbox{BF} = \frac{P(d|h_1)}{P(d|h_0)} = \frac{0.1}{0.2} = 0.5\]

Read literally, this result tells us that the evidence in favour of the alternative is 0.5 to 1. I find this hard to understand. To me, it makes a lot more sense to turn the equation “upside down”, and report the amount of evidence in favour of the null. In other words, what we calculate is this:

\[\mbox{BF}^\prime = \frac{P(d|h_0)}{P(d|h_1)} = \frac{0.2}{0.1} = 2\]

And what we would report is a Bayes factor of 2:1 in favour of the null. That is much easier to understand, and you can interpret it using the table above.
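The "upside down" reporting rule is easy to automate. The sketch below uses the likelihoods from the example above; the function name is hypothetical, and treating a Bayes factor of exactly 1 as favouring the alternative is an arbitrary choice made only to keep the code short.

```python
def report_bayes_factor(likelihood_h0, likelihood_h1):
    """Return (favoured hypothesis, Bayes factor >= 1) from the two likelihoods."""
    bf_10 = likelihood_h1 / likelihood_h0  # evidence for h1 over h0
    if bf_10 >= 1:
        return "alternative", bf_10
    # Flip the ratio so that the reported number is always at least 1
    return "null", 1.0 / bf_10


favoured, bf = report_bayes_factor(likelihood_h0=0.2, likelihood_h1=0.1)
print(f"Bayes factor of {bf:g}:1 in favour of the {favoured}")
# prints: Bayes factor of 2:1 in favour of the null
```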