ANOVA estimated marginal means

Discuss the jamovi platform, possible improvements, etc.

by ericcfields » Wed Mar 17, 2021 5:43 pm

I have some fake data comparing two age groups (young vs old) with one within-subjects factor (condition: A vs B ). The marginal means reported by the RM ANOVA in jamovi don't match what I expect and I'm trying to understand how they are calculated.

I've attached a jamovi file showing all the analyses described here.

The means are

young A = -0.257
young B = 0.693
old A = 0.729
old B = 1.564

I expect the marginal means for the main effects to be:
young = (-0.257 + 0.693)/2 = 0.218
old = (0.729 + 1.564)/2 = 1.146
A = (-0.257 + 0.729)/2 = 0.236
B = (0.693 + 1.564)/2 = 1.129

And I expect the marginal means for the interaction to match the means in the descriptives as given above.

The reported estimated marginal means report exactly the expected pattern, except that every reported value (across main effects and interactions) is 0.232 less than expected. I don't understand where this contstant of 0.232 is coming from.

If I instead calculated the model as a linear mixed model in the GAMLj module (again, see attached jamovi file), I get the exact same ANOVA results, but the estimated marginal means are now the values I expect.
Attachments
example.omv
(20.3 KiB) Downloaded 32 times
ericcfields
 
Posts: 6
Joined: Tue Jul 16, 2019 8:55 pm

by jonathon » Wed Mar 17, 2021 10:06 pm

hi,

the usual cause and explanation for this is:

means from descriptives are completely unconstrained, where as estimated marginal means are based on a model, and the assumptions of the model impose constraints on them. this is why they're called *estimated* marginal means, because they're estimated from the model.

russ lenth's introduction to EMMs is probably the best resource i've been able to find:

https://cran.r-project.org/web/packages/emmeans/vignettes/basics.html

there are a number of topics where (i'd argue) the stats community haven't really done a great job of producing materials explaining things -- unfortunately this is one of them.

cheers

jonathon
User avatar
jonathon
 
Posts: 1689
Joined: Fri Jan 27, 2017 10:04 am

by ericcfields » Thu Mar 18, 2021 1:30 pm

Hi Jonathon,

I thought I understood EMMs. I explained how I think they should be calculated (which gives the same means you would get from descriptives in some cases, but not others—e.g., not for the main effects of condition in my example).

I understand that EMMs are model predictions. But as I noted in my original post, my calculations match the EMMs you get from an equivalent mixed model. I would think the RM ANOVA and mixed model should make the same predictions for the cell means.

I've read documentation for the emmeans module (including the part you link) and have used it myself in R. I've played around with this data looking at it in different ways. I just can't figure out where these numbers are coming from.

It's possible I'm missing or misunderstanding something, but can you explain how the EMMs for the RM ANOVA in my example are calculated?
ericcfields
 
Posts: 6
Joined: Tue Jul 16, 2019 8:55 pm

by Ravi » Thu Mar 18, 2021 8:14 pm

Hi Eric,

The estimated marginal means are calculated in the following way for your data:
Screenshot 2021-03-18 at 21.10.58.png
Screenshot 2021-03-18 at 21.10.58.png (92.29 KiB) Viewed 642 times


Why it doesn't give the result you'd expect I'm not sure, but if you want to know it's probably better to contact either the author of the afex package or the author of the emmeans package? I added this Rj analysis to your file if you want to have a look yourself:
example.omv
(21.36 KiB) Downloaded 33 times


Cheers,
Ravi
User avatar
Ravi
 
Posts: 162
Joined: Sat Jan 28, 2017 11:18 am

by ericcfields » Thu Mar 18, 2021 9:38 pm

Hi Ravi and Jonathon,

I have been working on this myself in R. As you show, you get the same result using afex and emmeans in R, so it's not specific to jamovi. However, I think the jamovi developers might want to consider making a change here, because I'm not sure what is being reported is useful or makes sense.

My R script is below. If you run it in R, what you see is that calling emmeans on the afex model produces a warning:

Warning: EMMs are biased unless design is perfectly balanced


Running emmeans on the mixed model output, which produces what I think are the correct marginal means, does not give that error. So currently jamovi is reporting marginal means that emmeans itself calls "biased", but jamovi users are not seeing this warning.

I investigated a little further and figured out what emmeans is reporting here.

If we take the mean of the four cell means, we get 0.682. This is also the intercept from the mixed model (with deviation coded contrasts). This is the unbiased prediction for the overall mean for a sample that had equal numbers of young and old participants.

The mean of all observations in the DV in this sample is 0.450. It is different than the grand mean calculated above because there are twice as many young participants as old. In other words, this unweighted mean of all observations is biased toward the value for young participants.

The difference between the marginal means you get from the ANOVA and the mixed model is that all marginal means (main effects and interactions) are 0.232 less when calculated from the ANOVA. This is just 0.682 – 0.450.

In other words, both models produce the exact same pattern of marginal means (i.e., all differences between cells are the same), but the marginal means from the ANOVA are shifted so that the average of the cell means is the unweighted average of the DV—and therefore biased toward the expectation for young participants.

But it’s not just that the grand mean is biased toward the young participants. The way this gets applied to the cell means results in all the marginal means, for both young and old, being biased too low.

So these marginal means are *not* the model prediction for each cell. Both models (afex ANOVA and lmer) make the same predictions for the cell means, and these predictions are what are reported as marginal means from the mixed model. The marginal means reported for the ANOVA here are also not unbiased estimates of any useful population value. Perhaps there is some situation in which these emms would be useful, but I can’t think of what it is.

Unless there is something I’m missing here, I really don't think this is what jamovi should be reporting.

R code (the .csv file loaded in the code is attached):

Code: Select all
library(psych)
library(afex)
library(lme4)
library(emmeans)

#Default to deviation contrasts
options(contrasts = rep("contr.sum", 2))

#Get data and make categorical variables factors
data <- read.csv("example.csv")
data$group <- factor(data$group)
data$cond <- factor(data$cond)

#Calculate afex ANOVA and report ANOVA and emmeans
aov.m <- aov_car(Y ~ cond*group + Error(sub_id/cond), data=data)
print(aov.m)
print(emmeans(aov.m, ~group*cond))

#Calculate lme4 lmer and report ANOVA and emmeans
lmer.m <- lmer(Y ~ cond*group + (1|sub_id), data=data)
print(anova(lmer.m, type=3))
print(emmeans(lmer.m, ~group*cond))

#Get the afex model prediction for a Young-A observation
#and the EMM for Young-A
#Then find the difference in these
ya_pred <- aov.m$lm$fitted.values[1, "A"]
ya_emm <- summary(emmeans(aov.m, ~group*cond))[2, "emmean"]
diff <- ya_pred - ya_emm

#Get the difference between the intercept
#(the expected mean in the DV if group ns were equal)
#and the mean of all observations
wm_diff <- summary(lmer.m)$coefficients["(Intercept)", "Estimate"] - mean(data$Y)

#These two values are the same (within floating point error)
(diff - wm_diff) < 1e-12
Attachments
example.csv
(950 Bytes) Downloaded 46 times
ericcfields
 
Posts: 6
Joined: Tue Jul 16, 2019 8:55 pm

by jonathon » Fri Mar 19, 2021 12:32 am

i think we don't really feel qualified to weigh in here. there have been a lot of careful decisions made at the level of afex and emmeans, and we weren't part of that process, so we're not really in a position to comment. further, we can't really change afex or emmeans anyway.

i'd suggest taking your query to henrik of afex. if it turns out there's an issue, henrik addresses it, then a whole host of projects building on top of it will benefit -- including jamovi.

cheers

jonathon
User avatar
jonathon
 
Posts: 1689
Joined: Fri Jan 27, 2017 10:04 am

by Ravi » Fri Mar 19, 2021 7:51 am

Hi Eric,

I think the problem is just that a repeated measures anova does not take into account unbalanced data, while a mixed model can. In the end, the estimated marginal means are based on your model and if the model does not take into account the unbalanced data, the estimated marginal means will also not do so. So maybe a mixed model is just more appropriate in this case. Here's a little blurb from the emmeans documentation:
Screenshot 2021-03-19 at 08.41.31.png
Screenshot 2021-03-19 at 08.41.31.png (75.93 KiB) Viewed 619 times


Having said that, we're not experts on the RM ANOVA, so if you'd like to know more of the specifics it's probably best to contact Henrik Singmann (author of the afex package).

Cheers,
Ravi
User avatar
Ravi
 
Posts: 162
Joined: Sat Jan 28, 2017 11:18 am

by ericcfields » Fri Mar 19, 2021 4:04 pm

Hi Ravi,

RM ANOVA can handle unbalanced data fine. The ANOVA and mixed models here give the same results and make the same predictions. Since marginal means are model predictions, they should produce the same marginal means.

The text you screenshot describes what marginal means should be doing. What my example and code shows is that this does not describe what jamovi is reporting for unbalanced ANOVA models.

The issue here is with implementation. The afex/emmeans developers are transparent about this: a warning tells us the marginal means jamovi reports are biased.

I emailed Henrik Singmann, and it turns out the fix is very simple. Using the example in the code I posted, instead of:

Code: Select all
emmeans(aov.m, ~group*cond)


The call should be:

Code: Select all
emmeans(aov.m, ~group*cond, model="multivariate")


This gets rid of the warning and produces the correct marginal means.
ericcfields
 
Posts: 6
Joined: Tue Jul 16, 2019 8:55 pm

by Ravi » Sat Mar 20, 2021 3:08 pm

Hi Eric,

This is really useful information! I wasn't aware of this option and I see that Hendrik now recommends using emmeans this way for the rm afex models, so I'll create a fix for the rm anova in jmv. Thanks for figuring out how to fix this! As I said, in the end we are not experts on all the analyses that are available in jamovi and we lean on the implementations of analyses in R. We are not always aware of new functionality being added to the libraries we use, so it's very useful if someone makes us aware of this.

Cheers,
Ravi
User avatar
Ravi
 
Posts: 162
Joined: Sat Jan 28, 2017 11:18 am

by jonathon » Sun Mar 21, 2021 1:10 am

yup. i'd like to add that we do get a lot of people telling us "the results jamovi produces are wrong!", and 99.9% of the time that's not the case, they haven't understood something, and it's hard for us to escape the feeling that we're having our time wasted. so i do apologise if it sounded like we were reluctant to acknowledge an issue.

with thanks

jonathon
User avatar
jonathon
 
Posts: 1689
Joined: Fri Jan 27, 2017 10:04 am


Return to General