Section author: Danielle J. Navarro and David R. Foxcroft
Multiple linear regression
The simple linear regression model that we have discussed up to this point
assumes that there is a single predictor variable that you are interested in,
in this case dani.sleep. In fact, up to this point every statistical tool
that we have talked about has assumed that your analysis uses one predictor
variable and one outcome variable. However, in many (perhaps most) research
projects you actually have multiple predictors that you want to examine. If so,
it would be nice to be able to extend the linear regression framework to be
able to include multiple predictors. Perhaps some kind of multiple
regression model would be in order?
Multiple regression is conceptually very simple. All we do is add more terms to
our regression equation. Let us suppose that we have got two variables that we
are interested in; perhaps we want to use both dani.sleep and
baby.sleep to predict the dani.grump variable. As before, we let
\(Y_i\) refer to my grumpiness on the i-th day. But now we have two X
variables: the first corresponding to the amount of sleep I got and the second
corresponding to the amount of sleep my son got. So we will let \(X_{i1}\)
refer to the hours I slept on the i-th day and \(X_{i2}\) refer to the
hours that the baby slept on that day. If so, then we can write our regression
model like this:
\[Y_i = b_0 + b_1 X_{i1} + b_2 X_{i2} + \epsilon_i\]
As before, \(\epsilon_i\) is the residual associated with the i-th observation, \({\epsilon}_i = {Y}_i - \hat{Y}_i\). In this model we now have three coefficients that need to be estimated: \(b_0\) is the intercept, \(b_1\) is the coefficient associated with my sleep, and \(b_2\) is the coefficient associated with my son's sleep. Although the number of coefficients that need to be estimated has changed, the basic idea of how the estimation works is unchanged: the estimated coefficients \(\hat{b}_0\), \(\hat{b}_1\) and \(\hat{b}_2\) are the ones that minimise the sum of the squared residuals.
Doing it in jamovi
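To make the least-squares idea concrete, here is a minimal Python sketch. It is not how jamovi is implemented, and it does not use the sleep data set from the text: the data are simulated, and every number in the generating model (100, -5, 0.5, the noise level) is an assumption made up purely for illustration. It fits a two-predictor model with NumPy's `np.linalg.lstsq` and then checks that moving a coefficient away from its estimate makes the sum of squared residuals larger.

```python
import numpy as np

# Simulated data only: none of these numbers come from the text.
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(7.0, 1.0, n)      # hypothetical first predictor
x2 = rng.normal(8.0, 2.0, n)      # hypothetical second predictor
noise = rng.normal(0.0, 4.0, n)
y = 100.0 - 5.0 * x1 + 0.5 * x2 + noise   # made-up generating model

# Design matrix: a column of ones for the intercept b0, then X1 and X2.
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares estimates of (b0, b1, b2).
b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)


def sum_sq_resid(b):
    """Sum of squared residuals for a candidate coefficient vector b."""
    resid = y - X @ b
    return float(resid @ resid)


ssr_hat = sum_sq_resid(b_hat)
# Nudge b1 away from its estimate; the fit can only get worse.
ssr_nudged = sum_sq_resid(b_hat + np.array([0.0, 0.1, 0.0]))

print("estimates:", b_hat)
print("SSR at estimates:", ssr_hat)
print("SSR after nudging b1:", ssr_nudged)
```

Whatever direction you nudge the coefficients in, the sum of squared residuals goes up, which is exactly what "the coefficients that minimise the sum of the squared residuals" means.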
Multiple regression in jamovi is no different to simple regression. All we have
to do is add additional variables to the Covariates box in jamovi. For
example, if we want to use both dani.sleep and baby.sleep as predictors
in our attempt to explain why I am so grumpy, then move baby.sleep across
into the Covariates box alongside dani.sleep. By default, jamovi
assumes that the model should include an intercept. The coefficients we get
this time are:
| Predictor | Estimate |
|---|---|
| Intercept | 125.966 |
| dani.sleep | -8.950 |
| baby.sleep | 0.011 |
The coefficient associated with dani.sleep is quite large, suggesting that
every hour of sleep I lose makes me a lot grumpier. However, the coefficient
for baby.sleep is very small, suggesting that it does not really matter how
much sleep my son gets. What matters as far as my grumpiness goes is how much
sleep I get. To get a sense of what this multiple regression model looks
like, Fig. 137 shows a 3D plot that plots all three variables, along
with the regression model itself.
Fig. 137 3D visualisation of a multiple regression model. There are two predictors in
the model, dani.sleep and baby.sleep, and the outcome variable is
dani.grump. Together, these three variables form a 3D space. Each
observation (dot) is a point in this space. In much the same way that a
simple linear regression model forms a line in 2D space, this multiple
regression model forms a plane in 3D space. When we estimate the regression
coefficients what we are trying to do is find a plane that is as close to
all the blue dots as possible.
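The fitted plane can also be evaluated directly from the estimates reported in the coefficient table. The short Python sketch below does this. The function name `predict_grump` and the sleep values fed into it (6, 7, 8 and 9 hours) are hypothetical and chosen only for illustration; the three coefficients are the ones jamovi reported.

```python
# Estimates taken from the jamovi coefficient table in the text.
B0 = 125.966       # intercept
B_DANI = -8.950    # coefficient for dani.sleep
B_BABY = 0.011     # coefficient for baby.sleep


def predict_grump(dani_sleep, baby_sleep):
    """Predicted grumpiness: the height of the regression plane at this point."""
    return B0 + B_DANI * dani_sleep + B_BABY * baby_sleep


# Hypothetical inputs, not observations from the data set.
base = predict_grump(6.0, 8.0)
one_more_hour_for_me = predict_grump(7.0, 8.0)
one_more_hour_for_baby = predict_grump(6.0, 9.0)

print(round(base, 3))
print(round(one_more_hour_for_me - base, 3))
print(round(one_more_hour_for_baby - base, 3))
```

Holding the baby's sleep fixed, one extra hour of my own sleep shifts the prediction by the dani.sleep coefficient, whereas one extra hour of the baby's sleep barely moves it. That is the same pattern described above: the plane slopes steeply along the dani.sleep axis and is almost flat along the baby.sleep axis.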
Formula for the general case
The equation that I gave above shows you what a multiple regression model looks like when you include two predictors. Not surprisingly, if you want more than two predictors, all you have to do is add more X terms and more b coefficients. In other words, if you have K predictor variables in the model, then the regression equation looks like this: