Garlaban wrote: I need to dig deeper into the literature!
Sure Clemente,
this is very good for "Statistical Thinking" and enriches the mind.
The method of omitting tied observations was originally a recommendation in the context of the sign test for a single sample of data, where it proved to yield the most powerful test; from there it spread to other contexts as well.
Your research is likely to lead you to verify that the literature contains no explicit recommendation to omit tied observations for the WMW test.
In the end, I think it can be summed up concisely: excluding the ties reduces the size of the sample, resulting in a loss of power. The loss is not too great if there are "not so many" ties. An advantage of the method is that it reduces the bias towards rejection of the null hypothesis.
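As a small illustration of what omitting the ties does to the sample size (the data below are made up for the sketch; the variable names are mine, not from the .omv file):

```r
# Hypothetical paired differences; the zeros are the ties
d <- c(2, 0, -1, 3, 0, 1, 0, 2, -2, 1)
n_all  <- length(d)      # 10 observations in total
n_kept <- sum(d != 0)    # 7 remain after the ties are omitted

# Sign test with ties omitted: count the positive differences
# among the non-zero ones
binom.test(sum(d > 0), n_kept)

# R's one-sample Wilcoxon test drops the zero differences
# automatically (warning that no exact p-value is possible with zeroes)
wilcox.test(d)
```

So three of the ten observations contribute nothing to either test; with only a few ties the power cost is modest, as noted above.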
Personally, I also like to take a look at the data with a permutation test.
I will refer to the variables of the file "Rj_Wilcoxon_ties.omv" attached earlier.
A permutation test is based on the idea that if there is no shift of the values from VAS_Pain_Before to VAS_Pain_After, we could flip the signs of the differences without harm. If we randomly flip these signs many times and compute the t-statistic for each of these "permuted" samples, we can approximate the "permutation distribution" of the t-statistic and use that distribution to obtain a reliable p-value.
There are too many possible permutations to consider them all by combinatorial methods, but simulating many of them gives a useful result. The result will be slightly different with each execution, but with, e.g., 100,000 iterations the variation is not large enough to change the conclusion about rejecting the null hypothesis.
To get more involved with R, copy (if you want) these few lines of code to the end of those already in the file, or open a new editor and paste them there; then run the code and have a little patience.
Code:
vasdif <- data$VAS_Bef_Aft_Diff   # paired differences Before - After
n <- length(vasdif)
set.seed(1953)                    # for reproducibility
ni <- 10^5                        # number of iterations
tobs <- t.test(vasdif)$statistic  # observed t-statistic
pobs <- t.test(vasdif)$p.value    # ordinary t-test p-value, for comparison
# randomly flip the signs of the differences ni times,
# recomputing the t-statistic for each "permuted" sample
tprm <- replicate(ni, t.test(vasdif * sample(c(-1, 1), n, replace = TRUE))$statistic)
# two-sided permutation p-value: proportion of permuted statistics
# at least as extreme as the observed one
pval <- mean(abs(tprm) >= abs(tobs))
# do not be in a hurry, the p arrives
cat("\nAfter", ni, "iterations\n", "p =", pval)
Now I'll stop bothering you ...
Cheers,
Maurizio