I think I have found an error in the paired t-test: the effect size appears to be calculated incorrectly. If you tick the effect-size box, you get a value that is too high, because the standard deviation is not pooled the way I calculated it. To get a correct effect size you currently have to use the esci module, which is a shame. Can this be fixed?
/Ulf
Incorrect effect-size value in paired t-test
Re: Incorrect effect-size value in paired t-test
hi,
this is the calculation used:
pooledSD <- tryNaN(stats::sd(column1-column2))
sediff <- pooledSD/sqrt(n)
d <- (m1-m2)/pooledSD
https://github.com/jamovi/jmv/blob/mast ... .R#L66-L68
what would you suggest it should be?
kind regards
Re: Incorrect effect-size value in paired t-test
Hi. I think something may be wrong there. A paired t test is equivalent to a one-sample t test on difference scores. Therefore there is only one SD describing the sample of difference scores: There aren't multiple sample-SDs that would need to be pooled. So the concept of pooling shouldn't be applicable here.
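The equivalence can be checked directly in R. This is a minimal sketch with made-up numbers; the vectors `pre` and `post` are hypothetical and not taken from this thread.

```r
# Hypothetical paired measurements (illustrative values only)
pre  <- c(12, 15, 11, 18, 14, 16)
post <- c(10, 14, 12, 15, 11, 13)

diffs <- pre - post

# Paired t test and one-sample t test on the difference scores
t_paired    <- t.test(pre, post, paired = TRUE)
t_onesample <- t.test(diffs, mu = 0)

# Both give the same t statistic and degrees of freedom
all.equal(unname(t_paired$statistic), unname(t_onesample$statistic))
all.equal(unname(t_paired$parameter), unname(t_onesample$parameter))

# Effect size from the single SD of the difference scores
d <- mean(diffs) / sd(diffs)

# The t statistic is d scaled by sqrt(n)
all.equal(unname(t_paired$statistic), d * sqrt(length(diffs)))
```

Each `all.equal()` call should return TRUE, which is why only one SD (that of the differences) is involved.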
Re: Incorrect effect-size value in paired t-test
difference <- column1 - column2
# sd of difference
sd_diff <- sd(difference, na.rm = TRUE)
# standard error of difference
sediff <- sd_diff / sqrt(length(na.omit(difference)))
# Cohen's d
m1 <- mean(column1, na.rm = TRUE)
m2 <- mean(column2, na.rm = TRUE)
d <- (m1 - m2) / sd_diff
In short, for paired (dependent) samples t-tests you should use the standard deviation of the differences (not the pooled SD). Cohen’s d should also be calculated using the standard deviation of the differences.
Re: Incorrect effect-size value in paired t-test
In checking some examples in jamovi, it appears that the effect size for the paired samples t test is being computed correctly based on the standard deviation of the difference scores (and nothing to do with pooling). The esci module appears to be doing something else--not calculating the effect size for difference scores--which leads to a different result. I'm not familiar with what esci is trying to do.
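One possible source of the discrepancy, offered as an assumption and not as a description of what esci actually does: some tools standardize the mean difference by the average SD of the two measures instead of by the SD of the difference scores. The sketch below computes both versions for hypothetical data; `d_z` and `d_av` are just labels used here for illustration.

```r
# Hypothetical paired data (illustrative values only)
pre  <- c(12, 15, 11, 18, 14, 16)
post <- c(10, 14, 12, 15, 11, 13)
diffs <- pre - post

# Standardized by the SD of the difference scores
d_z <- mean(diffs) / sd(diffs)

# Standardized by the average of the two variances
# (an assumed alternative standardizer, not confirmed for esci)
sd_av <- sqrt((var(pre) + var(post)) / 2)
d_av  <- mean(diffs) / sd_av

c(d_z = d_z, d_av = d_av)
```

The variance of the differences is `var(pre) + var(post) - 2 * cov(pre, post)`, so with equal SDs and a correlation above 0.5 the SD of the differences is smaller than the average SD and `d_z` comes out larger than `d_av`. That would be consistent with the jamovi value looking "too high" next to a value standardized another way.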
Re: Incorrect effect-size value in paired t-test
Here is the correct way to calculate Cohen’s d for a paired samples t-test in R:
diffs <- column1 - column2
d <- mean(diffs, na.rm=TRUE) / sd(diffs, na.rm=TRUE)
If the columns have missing values (NAs) in different places, the number of non-missing values in column1 (or column2) can differ from the number of complete pairs in column1 - column2. As a result, mean(column1, na.rm=TRUE) - mean(column2, na.rm=TRUE) will generally not match mean(column1 - column2, na.rm=TRUE), which leads to an incorrect effect size. Always use the vector of differences for both the numerator and the denominator.
Re: Incorrect effect-size value in paired t-test
Example:
> column1 <- c(10, 20, 30, 40, 50, NA)
> column2 <- c(8, 17, 28, 39, NA, 60)
> # Incorrect calculation
> m1 <- mean(column1, na.rm=TRUE)
> m2 <- mean(column2, na.rm=TRUE)
> pooledSD <- sd(column1 - column2, na.rm=TRUE)
> d_wrong <- (m1 - m2) / pooledSD
> d_wrong
[1] -0.4898979
> # Correct calculation
> diffs <- column1 - column2
> d_right <- mean(diffs, na.rm=TRUE) / sd(diffs, na.rm=TRUE)
> d_right
[1] 2.44949
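To avoid the mismatch altogether, the incomplete pairs can be dropped first so that the numerator and denominator come from the same rows. A minimal sketch, where `paired_d` is a hypothetical helper name and not a function from jamovi or jmv:

```r
# Hypothetical helper: Cohen's d for paired data, complete pairs only
paired_d <- function(x, y) {
  stopifnot(length(x) == length(y))
  keep  <- complete.cases(x, y)   # TRUE where both values are present
  diffs <- x[keep] - y[keep]      # difference scores for complete pairs
  mean(diffs) / sd(diffs)
}

column1 <- c(10, 20, 30, 40, 50, NA)
column2 <- c(8, 17, 28, 39, NA, 60)
paired_d(column1, column2)
```

With the data from the example above, this uses the four complete pairs and gives the same value as d_right.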