I think I have found an error in the paired t-test: the effect size is calculated incorrectly. If you tick the effect-size box, the value you get is too high, because the standard deviation is not pooled the way I have calculated it. To get a correct effect size you have to use the esci module, which is a shame. Can this be fixed?
/Ulf
Incorrect effect-size value in paired t-test
Re: Incorrect effect-size value in paired t-test
hi,
this is the calculation used:
pooledSD <- tryNaN(stats::sd(column1-column2))
sediff <- pooledSD/sqrt(n)
d <- (m1-m2)/pooledSD
https://github.com/jamovi/jmv/blob/mast ... .R#L66-L68
what would you suggest it should be?
kind regards
Re: Incorrect effect-size value in paired t-test
Hi. I think something may be wrong there. A paired t test is equivalent to a one-sample t test on difference scores. Therefore there is only one SD describing the sample of difference scores: There aren't multiple sample-SDs that would need to be pooled. So the concept of pooling shouldn't be applicable here.
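To make the equivalence concrete, here is a minimal sketch in base R. The vectors `pre` and `post` are made-up data, not taken from this thread. It checks that the paired test and the one-sample test on the difference scores give the same t statistic, and that d computed from the single SD of the differences equals t / sqrt(n).

```r
# Hypothetical paired scores, made up for illustration only
pre  <- c(12, 15, 11, 18, 14, 16, 13, 17)
post <- c(10, 14, 12, 15, 11, 15, 10, 16)
diffs <- pre - post

# Paired t-test and one-sample t-test on the difference scores
t_paired <- t.test(pre, post, paired = TRUE)
t_onesample <- t.test(diffs, mu = 0)

# Effect size based on the single SD of the difference scores
d <- mean(diffs) / sd(diffs)

# Compare the two t statistics, and d against t / sqrt(n)
same_t <- isTRUE(all.equal(unname(t_paired$statistic),
                           unname(t_onesample$statistic)))
d_from_t <- unname(t_paired$statistic) / sqrt(length(diffs))

print(same_t)
print(c(d = d, d_from_t = d_from_t))
```

No second SD appears anywhere in this calculation, which is why there is nothing to pool.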
Re: Incorrect effect-size value in paired t-test
difference <- column1 - column2
# sd of difference
sd_diff <- sd(difference, na.rm = TRUE)
# standard error of difference
sediff <- sd_diff / sqrt(length(na.omit(difference)))
# Cohen's d
m1 <- mean(column1, na.rm = TRUE)
m2 <- mean(column2, na.rm = TRUE)
d <- (m1 - m2) / sd_diff
In short, for paired (dependent) samples t-tests, you should use the standard deviation of the differences (not the pooled SD). Cohen’s d should also be calculated using the standard deviation of the differences.
Re: Incorrect effect-size value in paired t-test
In checking some examples in jamovi, it appears that the effect size for the paired samples t test is being computed correctly based on the standard deviation of the difference scores (and nothing to do with pooling). The esci module appears to be doing something else--not calculating the effect size for difference scores--which leads to a different result. I'm not familiar with what esci is trying to do.
Re: Incorrect effect-size value in paired t-test
Here is the correct way to calculate Cohen’s d for a paired samples t-test in R:
diffs <- column1 - column2
d <- mean(diffs, na.rm=TRUE) / sd(diffs, na.rm=TRUE)
If there are missing values (NAs) in different places in each column, then the number of non-missing values in column1, column2, and column1 - column2 may differ. As a result, mean(column1, na.rm=TRUE) - mean(column2, na.rm=TRUE) will not match mean(column1 - column2, na.rm=TRUE), which leads to an incorrect effect size. Always use the vector of differences for both the numerator and denominator.
Re: Incorrect effect-size value in paired t-test
Example:
> column1 <- c(10, 20, 30, 40, 50, NA)
> column2 <- c(8, 17, 28, 39, NA, 60)
> # Incorrect calculation
> m1 <- mean(column1, na.rm=TRUE)
> m2 <- mean(column2, na.rm=TRUE)
> pooledSD <- sd(column1 - column2, na.rm=TRUE)
> d_wrong <- (m1 - m2) / pooledSD
> d_wrong
[1] -0.4898979
> # Correct calculation
> diffs <- column1 - column2
> d_right <- mean(diffs, na.rm=TRUE) / sd(diffs, na.rm=TRUE)
> d_right
[1] 2.44949
Re: Incorrect effect-size value in paired t-test
So I looked into the calculation of Cohen's d in the paired t-test, and I think it's being calculated correctly at the moment.
If you look at https://github.com/jamovi/jmv/blob/aa1f ... .R#L16-L17 and https://github.com/jamovi/jmv/blob/aa1f ... .R#L45-L47 you see that the missing data is already excluded before calculating all the statistics; only cases that have a score for both column1 and column2 are included. Therefore, there's no need to exclude missing values a second time when calculating the scores.
I am still curious about the discrepancy Ulf mentioned though, so if you could elaborate on that, that would be great.
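A small sketch of why that holds, reusing the vectors from the example earlier in the thread. The listwise step uses base R's `complete.cases()` as a stand-in; it is not the jmv code itself. Once incomplete pairs are dropped, the difference of the means and the mean of the differences agree, so both formulas give the same d.

```r
column1 <- c(10, 20, 30, 40, 50, NA)
column2 <- c(8, 17, 28, 39, NA, 60)

# Keep only the pairs with a score in both columns (a stand-in for the
# listwise exclusion described above, not the actual jmv code)
keep <- complete.cases(column1, column2)
c1 <- column1[keep]
c2 <- column2[keep]

# After the exclusion the two formulas coincide
d_means <- (mean(c1) - mean(c2)) / sd(c1 - c2)
d_diffs <- mean(c1 - c2) / sd(c1 - c2)

print(c(d_means = d_means, d_diffs = d_diffs))
```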
Re: Incorrect effect-size value in paired t-test
@Ravi The esci module appears to be calculating an effect size (and CI) for the difference between two means computed from paired samples, somehow taking the pairing into account, but NOT simply calculating the effect size (and CI) for difference scores.
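One possible reading, offered as an assumption and not as a statement about the esci code: some approaches for paired designs standardize the mean difference by the average of the two SDs (often written d_av) instead of by the SD of the difference scores (often written d_z). The sketch below uses made-up data and hypothetical variable names to show how the two standardizers can give different values for the same paired data.

```r
# Hypothetical paired scores, made up for illustration only
pre  <- c(12, 15, 11, 18, 14, 16, 13, 17)
post <- c(10, 14, 12, 15, 11, 15, 10, 16)
diffs <- pre - post

# d_z: standardized by the SD of the difference scores
d_z <- mean(diffs) / sd(diffs)

# d_av: standardized by the average of the two SDs
# (assumed here as one alternative; not confirmed to be what esci does)
sd_av <- sqrt((var(pre) + var(post)) / 2)
d_av <- mean(diffs) / sd_av

# The variance of the differences shrinks as the covariance grows:
# var(diffs) = var(pre) + var(post) - 2 * cov(pre, post)
var_identity <- var(pre) + var(post) - 2 * cov(pre, post)

print(c(d_z = d_z, d_av = d_av, r = cor(pre, post)))
```

With equal SDs in the two columns, d_z is larger than d_av whenever the correlation between them exceeds 0.5. That would be consistent with the jamovi value looking too high next to the esci value, if esci does use a standardizer of this kind.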