Inconsistent b0 in linear regression between Jamovi and R

Discuss statistics-related things

by joseluisblues » Wed Mar 13, 2019 9:29 pm

Hi,

I was comparing Jamovi with R and found a striking difference in the intercept when I fit a linear model.
Say I have:

"Picture",30
"Picture",35
"Picture",45
"Picture",40
"Picture",50
"Picture",35
"Picture",55
"Picture",25
"Picture",30
"Picture",45
"Picture",40
"Picture",50
"Real Spider",40
"Real Spider",35
"Real Spider",50
"Real Spider",55
"Real Spider",65
"Real Spider",55
"Real Spider",50
"Real Spider",35
"Real Spider",30
"Real Spider",50
"Real Spider",60
"Real Spider",39

Jamovi is giving me:

Predictor Estimate SE t p
Intercept 43.50 2.08 20.90 < .001
Group:
Real Spider – Picture 7.00 4.16 1.68 0.107

But R,
With:

m1 <- lm(Anxiety ~ Group, data=spiderLong)
summary(m1)

R version 3.5.0 (2018-04-23) -- "Joy in Playing"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

> m1 <- lm(Anxiety ~ Group, data=spiderLong)
> summary(m1)

Call:
lm(formula = Anxiety ~ Group, data = spiderLong)

Residuals:
Min 1Q Median 3Q Max
-17.0 -8.5 1.5 8.0 18.0

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 40.000 2.944 13.587 3.53e-12 ***
GroupReal Spider 7.000 4.163 1.681 0.107
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.2 on 22 degrees of freedom
Multiple R-squared: 0.1139, Adjusted R-squared: 0.07359
F-statistic: 2.827 on 1 and 22 DF, p-value: 0.1068

Why is the intercept 40 in one case and 43.5 in the other?

Thanks for any hint!

José
joseluisblues
 
Posts: 2
Joined: Wed Mar 13, 2019 8:30 pm

by jonathon » Wed Mar 13, 2019 11:40 pm

hi,

it's possible some options differ between your local R and jamovi.

`options('contrasts')` seems like a likely candidate.
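
for reference, a quick sketch of how to check (and change) that option in R; the output shown is for R's factory defaults:

# show which contrast codings R currently uses for unordered and ordered factors
# (the factory default is treatment/"dummy" coding for unordered factors)
> options('contrasts')
$contrasts
        unordered           ordered 
"contr.treatment"      "contr.poly" 

# switching the default to sum-to-zero coding changes what the intercept means:
# it becomes the mean of the group means rather than the reference group mean
> options(contrasts = c("contr.sum", "contr.poly"))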

let me know what you find.

cheers

jonathon
jonathon
 
Posts: 933
Joined: Fri Jan 27, 2017 10:04 am

by MAgojam » Thu Mar 14, 2019 11:35 pm

Hi, @jonathon.
I tried @joseluisblues's example with regress in Stata v15.1, which replicates the same results as lm in R.
It seems that in jamovi, jmv::linReg returns the intercept of the null model (without covariates), ignoring the factor predictor.
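A quick check in R (a sketch, assuming spiderLong is rebuilt from the data in the first post) confirms that an intercept-only model does give 43.5, the grand mean:

> m0 <- lm(Anxiety ~ 1, data = spiderLong)   # intercept-only ("null") model
> coef(m0)
(Intercept) 
       43.5 
> mean(spiderLong$Anxiety)
[1] 43.5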
I am attaching a screenshot of the Stata output.
[Attachment: ScreenShot_LR.png — Stata regress output]

Cheers.
Maurizio
Last edited by MAgojam on Sat Mar 16, 2019 12:23 pm, edited 1 time in total.
MAgojam
 
Posts: 62
Joined: Thu Jun 08, 2017 2:33 pm
Location: Parma (Italy)

by jonathon » Fri Mar 15, 2019 1:24 am

oh righto. i'll direct ravi to this thread.

cheers

jonathon
jonathon
 
Posts: 933
Joined: Fri Jan 27, 2017 10:04 am

by joseluisblues » Sun Mar 17, 2019 9:09 pm

Hey, great, thanks to both of you,
So, is this a bug in the function that should be fixed, or is there a parameter that I can change?

cheers,
joseluisblues
 
Posts: 2
Joined: Wed Mar 13, 2019 8:30 pm

by Ravi » Sun Mar 17, 2019 9:17 pm

So I checked what's going on here, and it has to do with how we set up contrasts (so no bug). At the moment we set up the contrasts in such a way that the intercept represents the grand mean. I'm not sure anymore why we did it this way though; I'll have to look into it a bit more before deciding whether we should change it.
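
For anyone who wants to reproduce the jamovi table in R, here is a minimal sketch. The ±0.5 coding below is only an assumption of a coding that yields a grand-mean intercept, not jamovi's actual setup; with it, the intercept comes out as the grand mean (43.5) while the group coefficient stays the difference of 7, matching the table in the first post.

> spiderLong$Group <- factor(spiderLong$Group)                    # make sure Group is a factor
> contrasts(spiderLong$Group) <- matrix(c(-0.5, 0.5), ncol = 1)   # centered +/-0.5 coding (assumption)
> m2 <- lm(Anxiety ~ Group, data = spiderLong)
> round(summary(m2)$coefficients, 3)
# the intercept comes out as 43.5 (SE ~ 2.08) and the Group coefficient as 7.0
# (SE ~ 4.16), i.e. the same estimates jamovi reports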
Ravi
 
Posts: 140
Joined: Sat Jan 28, 2017 11:18 am

by jRafi » Tue Oct 08, 2019 2:35 pm

Have you decided yet whether to keep the grand mean as the intercept?

In my opinion, having the grand mean as the intercept is confusing. It is not consistent with other software or the notion that the intercept should reflect the reference level.
jRafi
 
Posts: 2
Joined: Fri Feb 15, 2019 9:08 pm
Location: Stockholm

by jonathon » Thu Oct 10, 2019 11:22 pm

we're still looking into it.

jonathon
jonathon
 
Posts: 933
Joined: Fri Jan 27, 2017 10:04 am

by nadia » Fri Oct 11, 2019 7:09 am

Hi,
I agree with jRafi. It's very confusing, and it's very difficult to explain to students why the intercept is not the estimate they expect.
I'm happy you're looking into it!
Nadia:)
nadia
 
Posts: 1
Joined: Fri Oct 11, 2019 7:01 am

by jonathon » Fri Oct 11, 2019 9:59 am

so i've experimented with using "dummy coding", but it leads to different (and unexpected) sums-of-squares if the model has an interaction in it. i asked marcello about this, and he writes:

For GAMLj I decided to set centered coding as the default, so the intercept is the grand mean, with the option for the user to change the coding (the same goes for continuous independent variables, which are centered on their means). This makes the SS and the parameter estimates consistent.

Using the "dummy" coding alters the SS because you have interactions in the model, so also the main effects are changed (they are no longer main effect, but simple effects estimated for the other variable reference group). SS of squares are not wrong, are simply the variances explained by the effects computed for the other variable reference group. I know that they are not what users expect. Thus, If you go for the "dummy" coding, the only reasonable way is to estimate the model twice, one with centered coding to obtain the "usual" SS, and one for the parameters with the dummy coding. This is exactly what SPSS does. In my experience with teaching on SPSS, however, this is very, very confusing for students and users. It is also a bit deceptive, because the user may be led to believe that he ran one model, but the software actually ran two different models. Furthermore, if the model contains interactions, all the lower order parameters will be different, yielding very "unexpected" results (from the user prospective). I would suggest not to go in that direction. If a user needs the parameters estimated for the reference group, the user can simply choose "dummy" in the factor coding option.
jonathon
 
Posts: 933
Joined: Fri Jan 27, 2017 10:04 am

