How are ties handled by jamovi in the Wilcoxon test?


by Garlaban » Sun Aug 14, 2022 10:28 am

Hi jamovi users/team :)

I'm currently trying to understand how jamovi handles ties in the Wilcoxon test.
When running a Wilcoxon test on ordinal data with ties, jamovi displays the following message: "XX pair(s) of values were tied", with "XX" being the number of ties.

I read elsewhere in the forum (https://forum.jamovi.org/viewtopic.php?p=6750) that jamovi uses the wilcox.test() function in R to implement the test.

However, in the function's documentation (https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/wilcox.test), I don't really see how ties are handled.

I know that there are several ways to handle ties when running a Wilcoxon test: the ties can be discarded, tied values can share the average of the ranks they span, the tied observations can be ordered randomly, etc. How does jamovi actually handle them?

edit: In a previous thread (https://github.com/jamovi/jamovi/issues/577) I found out that in case of ties "a normal approximation is used". But I'm still not sure what this means regarding how ties are handled.

Thanks!
Clement
Garlaban
 
Posts: 4
Joined: Sun Aug 14, 2022 10:13 am

by MAgojam » Sun Aug 14, 2022 8:03 pm

Hey @Clement,
you might want to take a look at the code in jamovi.
You can look here:
https://github.com/jamovi/jmv/blob/master/R/ttestps.b.R#L77

So you can see that under jamovi's hood there is a call to the wilcox.test() function of the stats package.
It might be interesting to take a look at the source of this function.
You can do this without leaving jamovi by using the Rj editor (once the Rj module is installed from the jamovi library).
Run this line of code from the editor:
Code:
stats:::wilcox.test.default


If you know a little bit of R you should have no problem following the code and finding that the ties (pairs whose difference is zero) are discarded, and this is reflected in jamovi.
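
As a quick check of that reading, here is a minimal sketch with made-up scores (the `before` and `after` values are hypothetical, not taken from jamovi or from this thread). It calls wilcox.test() once on all pairs and once after dropping the pairs whose difference is zero. Setting exact = FALSE is my own choice, made only so that both calls use the normal approximation and can be compared directly.

```r
# Hypothetical paired scores; three pairs have a difference of zero.
before <- c(7, 5, 8, 6, 9, 4, 7, 6, 8, 5)
after  <- c(5, 5, 6, 6, 7, 3, 4, 6, 5, 4)

# Test on all ten pairs.
full <- wilcox.test(before, after, paired = TRUE, exact = FALSE)

# Test after removing the zero-difference pairs by hand.
keep    <- (before - after) != 0
reduced <- wilcox.test(before[keep], after[keep], paired = TRUE, exact = FALSE)

# If zero differences are discarded internally, both results agree.
same_statistic <- isTRUE(all.equal(unname(full$statistic), unname(reduced$statistic)))
same_p_value   <- isTRUE(all.equal(full$p.value, reduced$p.value))
cat("Same statistic:", same_statistic, "\nSame p-value:", same_p_value, "\n")
```
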

Cheers,
Maurizio
MAgojam
 
Posts: 313
Joined: Thu Jun 08, 2017 2:33 pm
Location: Parma (Italy)

by Garlaban » Mon Aug 15, 2022 8:47 am

Hi Maurizio!


Thank you very much for your reply. I'm not familiar with R at all but I tried to get a better sense of what you are saying by installing the Rj Module and running the line you proposed in the editor.

I see "ties" mentioned in the code displayed when running "stats:::wilcox.test.default". I can more or less find the two corrections for ties described here (https://www.real-statistics.com/non-parametric-tests/wilcoxon-signed-ranks-test/) in the code, but I can't find the code that discards the ties. My overall understanding of your reply is that ties are discarded, and then an approximation is made with one of two formulas depending on the number of ties?

Now that I (maybe) understand better how ties are handled, do you (or someone else) know how the number of ties displayed by jamovi should be taken into account when interpreting the results? Intuitively I understand that 10 ties in a sample of 100 data points might not be problematic. But what about 100 ties in a sample of 300 data points? When working with ordinal data from 3- or 5-point Likert scales, ties are quite likely. I find a lot of discussion/research on corrections for ties but not much on their impact on the reliability of the Wilcoxon test's results.

Also, if jamovi excludes ties, shouldn't the message be more explicit (e.g., "XX pair(s) of values were tied and excluded from the analysis")?

edit: I realize that I didn't correctly understand what "ties" are. I thought a tie was when the two values of a pair are equal, but according to this (https://www.statstutor.ac.uk/resources/uploaded/wilcoxonsignedranktest.pdf) there are actually two kinds of "ties":
- when several pairs of values have the same difference between them as other pairs
- when the two values of a pair are equal and thus their difference is 0
Now I'm also confused about what "ties" in jamovi's warning refers to...
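
To make the two kinds of ties concrete, here is a small sketch on made-up scores (the `before` and `after` vectors are hypothetical). It counts the pairs whose difference is zero, and separately finds the non-zero differences that share the same absolute value and would therefore share a rank.

```r
# Hypothetical paired scores.
before <- c(6, 4, 7, 5, 8, 6, 5, 3, 9, 4)
after  <- c(4, 4, 8, 2, 8, 4, 4, 5, 5, 3)
d <- before - after

# Kind 1: the two values of a pair are equal, so the difference is zero.
n_zero <- sum(d == 0)

# Kind 2: several non-zero differences have the same absolute value.
abs_nonzero <- abs(d[d != 0])
group_sizes <- table(abs_nonzero)
tied_groups <- group_sizes[group_sizes > 1]

cat("Zero differences:", n_zero, "\n")
cat("Absolute differences shared by more than one pair:\n")
print(tied_groups)
```
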

Bests,
Clement

by MAgojam » Mon Aug 15, 2022 8:09 pm

Hey @Clement,
even though you say you are not familiar with R (you have made some interesting remarks anyway), I'll continue down this route and attach an omv file with two dummy variables holding 1-10 scores on a VAS (Visual Analogue Scale) for pain in the lumbar spine, before and after a specific rehabilitation treatment.
There are also some computed variables, such as the difference between before and after, ranks, and some R code for Rj.
The code is much reduced and simplified so that you can better understand what the wilcox.test() function does under jamovi's hood.

Garlaban wrote:- when pairs of values are equal and thus, their difference is 0
Now I'm also confused about what "ties" in jamovi's warning refers to...

Yes, when the difference within a pair of values is zero, the pair is excluded and contributes no rank to the rank sum.
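
A minimal sketch of that calculation, assuming a made-up vector of differences `d` (not the values in the attached file): zero differences are dropped, the remaining absolute differences are ranked with tied values sharing an average rank, and the statistic V is the sum of the ranks that belong to positive differences.

```r
# Hypothetical before-after differences; two of them are zero.
d <- c(2, 0, -1, 3, 0, 2, 1, -2, 4, 1)

x <- d[d != 0]        # pairs with a zero difference are excluded
r <- rank(abs(x))     # equal absolute differences share an average rank
V <- sum(r[x > 0])    # rank sum of the positive differences

# The same statistic as reported by wilcox.test() for these differences.
V_r <- unname(wilcox.test(d, exact = FALSE)$statistic)

cat("By hand: V =", V, "\nwilcox.test(): V =", V_r, "\n")
```
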

By default, the wilcox.test() function calculates exact p-values if the samples contain fewer than 50 finite values and there are no ties in the values; otherwise a normal approximation is used.
When the normal approximation is used, a continuity correction is applied by default (correct = TRUE). A continuity correction is an adjustment that is made when a discrete distribution is approximated by a continuous distribution.
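
The normal approximation can be reproduced by hand. This is a sketch on made-up differences `d` that follows the formulas visible in stats:::wilcox.test.default for the two-sided case: the variance is reduced by a term for tied ranks, and the continuity correction moves the statistic 0.5 towards its expected value.

```r
# Hypothetical before-after differences, with zeros and tied values.
d <- c(2, 0, -1, 3, 0, 2, 1, -2, 4, 1)

x <- d[d != 0]                    # zero differences are dropped
n <- length(x)
r <- rank(abs(x))
V <- sum(r[x > 0])

mu <- n * (n + 1) / 4             # expected value of V under the null
tie_sizes <- as.numeric(table(r)) # how many observations share each rank
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24 -
              sum(tie_sizes^3 - tie_sizes) / 48)

correction <- sign(V - mu) * 0.5  # continuity correction, two-sided case
z <- (V - mu - correction) / sigma
p_manual <- 2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))

# Reference value from R itself.
p_r <- wilcox.test(d, exact = FALSE, correct = TRUE)$p.value

cat("By hand: p =", p_manual, "\nwilcox.test(): p =", p_r, "\n")
```
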
Now, have fun. :)

Cheers,
Maurizio
Attachment: Rj_Wilcoxon_ties.omv (7.12 KiB)

by Annata20 » Fri Aug 19, 2022 3:37 am

Thanks for your reply
Annata20
 
Posts: 1
Joined: Fri Aug 19, 2022 2:51 am
Location: United States

by Garlaban » Sun Aug 21, 2022 8:26 am

Hi Maurizio,


Thank you very much for your in-depth reply and for providing an example dataset! By looking at the simplified code I was able to get what you mean quite clearly :)

Thank you also for your explanation of the continuity correction!

I'm still curious about the impact of ties on the validity of the test (is there a percentage of ties above which the test loses validity?). I need to dig deeper into the literature!


Bests,
Clement

by MAgojam » Sun Aug 21, 2022 1:09 pm

Garlaban wrote:I need to dig deeper in the literature!

Sure Clemente,
this is very good for "Statistical Thinking" and enriches the mind.

The practice of omitting tied observations was originally recommended for the one-sample sign test, where it proved to give the most powerful test, and it was later carried over to other contexts as well.
Your research is likely to lead you to verify that the literature does not contain an explicit recommendation to omit tied observations for WMW.

To wrap up this post, I think it can be summarised briefly: excluding the ties reduces the size of the sample, resulting in a loss of power. The loss is not too great if there are "not so many" ties. An advantage of the method is that it reduces the bias towards rejection of the null hypothesis.

Personally I also take a look with a permutation test.
I will refer to the variables of the file "Rj_Wilcoxon_ties.omv" attached earlier.
A permutation test is based on the idea that if there is no shift of the values from VAS_Pain_Before to VAS_Pain_After, we could change the signs of the differences without harm. If we randomly change these signs many times and compute the t-statistic for each of these "permuted" samples, we can approximate the "permutation distribution" of the t-statistic and use that distribution to get a reliable P-value.
There are too many possible permutations to consider them all by combinatorial methods, but simulating many cases gives a useful result. The result will be slightly different with each execution, but with, e.g., 100,000 iterations the differences are too small to change the conclusion about whether to reject the null hypothesis.
To get more involved with R, copy (if you want) these few lines of code to the end of those already in the file, or open a new editor and paste them there, then run the code and have a little patience.
Code:
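vasdif <- data$VAS_Bef_Aft_Diff   # before-after differences from the attached file
n <- length(vasdif)
set.seed(1953)                    # make the random sign flips reproducible
ni <- 10^5                        # number of iterations

# observed t statistic and its parametric p-value, for reference
tobs <- t.test(vasdif)$statistic
pobs <- t.test(vasdif)$p.value

# t statistic of each sample with randomly flipped signs
tprm <- replicate(ni, t.test(vasdif * sample(c(-1, 1), n, replace = TRUE))$statistic)
pval <- mean(abs(tprm) >= abs(tobs))   # two-sided permutation p-value

# do not be in a hurry, the p arrives
cat("\nAfter", ni, "iterations\n", "p =", pval)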
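
If the wait is too long, the same sign-flip idea can be sketched without calling t.test() in every iteration. Flipping signs leaves sum(d^2) unchanged, so |t| is an increasing function of |sum(d)|, and comparing absolute sums orders the permuted samples exactly as comparing |t| would. The vector `d` below is made up for illustration; in the attached file it would be data$VAS_Bef_Aft_Diff. The iteration count is also reduced here as an arbitrary choice.

```r
set.seed(1953)
d  <- c(2, 0, -1, 3, 0, 2, 1, -2, 4, 1)   # hypothetical differences
n  <- length(d)
ni <- 10^4                                # number of iterations

# One column of random signs per iteration; d is recycled down each column.
signs     <- matrix(sample(c(-1, 1), n * ni, replace = TRUE), nrow = n)
perm_sums <- colSums(signs * d)

# Two-sided p-value: share of sign-flipped sums at least as extreme as observed.
obs_sum <- sum(d)
pval    <- mean(abs(perm_sums) >= abs(obs_sum))

cat("\nAfter", ni, "iterations\n", "p =", pval, "\n")
```

Because everything is done with one matrix operation, this runs in a fraction of the time, at the cost of holding an n-by-ni matrix in memory.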


Now I'll stop pestering you ... :')
Cheers,
Maurizio

by Garlaban » Wed Aug 24, 2022 4:33 pm

Hi Maurizio!

MAgojam wrote:Your research is likely to lead you to verify that the literature does not contain an explicit recommendation to omit tied observations for WMW.


That's indeed what happened ^^'

Thank you very much for your suggestion of the "permutation test" and for the corresponding code (which I understood and ran successfully). I had no idea this test existed.

A warm thank you for the time you took to help me! The Wilcoxon test in jamovi is now much clearer for me, and that would not have happened without you :)

Bests,
Clement

