Post-hoc test after a chi² test of association

by RHainez » Thu May 18, 2017 7:09 am

Hi,

A chi² test of association tells us whether there is a significant difference between groups.

Is it possible to add a post-hoc test to identify which group is different from the others (when the size of the table is > 2x2)? Maybe a pairwise.prop.test() or something like that?

Have a nice day.

Romaric
RHainez
 
Posts: 57
Joined: Wed Feb 08, 2017 3:14 pm

by jonathon » Mon May 22, 2017 9:50 am

hi romaric,

this is a good idea. i will try and do this in the next fortnight. you've put in so many good feature requests, and i feel like we haven't got to any of them. anyway, i *will* do this in the next fortnight, so kick up a stink if i don't :)

with thanks
jonathon
 
Posts: 873
Joined: Fri Jan 27, 2017 10:04 am

by RHainez » Fri May 26, 2017 12:33 pm

Hi Jonathon,

After a little research, a simpler way to implement a post-hoc test for a chi² test might be to use the existing chisq.test() from the stats package, which returns the chi² statistic, its p-value, the Pearson residuals (Xsq$residuals) and the standardized (Haberman) residuals (Xsq$stdres).

Example:

> M <- as.table(rbind(c(212,29,11,2,3), c(318,61,6,11,13), c(160,39,9,6,12)))
> rownames(M) <- c("a1","a2","a3")
> colnames(M) <- c("b1","b2","b3","b4","b5")
> M
    b1  b2  b3  b4  b5
a1 212  29  11   2   3
a2 318  61   6  11  13
a3 160  39   9   6  12
> Xsq <- chisq.test(M)
Warning message:
In chisq.test(M) : Chi-squared approximation may be incorrect
> Xsq

        Pearson's Chi-squared test

data:  M
X-squared = 20.3583, df = 8, p-value = 0.009062

> Xsq$residuals
            b1          b2          b3          b4          b5
a1  0.93616090 -1.33963261  1.28206096 -1.48489514 -1.78406400
a2  0.09113804  0.24066234 -1.71501389  0.77521492  0.04505463
a3 -1.12090876  1.10480426  0.93998072  0.54059524  1.84188135

> Xsq$stdres
            b1          b2          b3          b4          b5
a1  2.33159333 -1.71672778  1.54215391 -1.77896194 -2.14848117
a2  0.26026470  0.35362027 -2.36537490  1.06489378  0.06221195
a3 -2.72597782  1.38245439  1.10404745  0.63240142  2.16587067

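The chi² test is significant at about the 1% level (p = 0.009), so every value of Xsq$stdres outside [-2.33; +2.33] (the one-tailed 1% critical value of the standard normal distribution) flags a cell that deviates significantly from independence. Here, the significant cells are (a1, b1), (a2, b3) and (a3, b1).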
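The residual screening described above can be sketched in a few lines of base R. This is only an illustration of the idea in the post, not a settled procedure: the 2.33 cutoff is the one quoted above, and the names cutoff, z and flagged_cells are made up for the example. The Holm adjustment at the end is an added assumption, one common way to account for testing all 15 cells at once.

```r
# Same table as in the example above
M <- as.table(rbind(c(212, 29, 11, 2, 3),
                    c(318, 61, 6, 11, 13),
                    c(160, 39, 9, 6, 12)))
dimnames(M) <- list(c("a1", "a2", "a3"), c("b1", "b2", "b3", "b4", "b5"))

# chisq.test() warns about small expected counts for this table
Xsq <- suppressWarnings(chisq.test(M))

# Standardized residuals as a plain matrix
z <- unclass(Xsq$stdres)

# Cutoff quoted in the post (assumption: a two-sided 1% test
# would use qnorm(0.995), about 2.58, instead)
cutoff <- 2.33

# Locate the cells whose standardized residual exceeds the cutoff
idx <- which(abs(z) > cutoff, arr.ind = TRUE)
flagged_cells <- data.frame(row    = rownames(z)[idx[, 1]],
                            col    = colnames(z)[idx[, 2]],
                            stdres = z[idx])
print(flagged_cells)

# Optional: two-sided p-values per cell, Holm-adjusted for 15 comparisons
p_cell <- 2 * pnorm(-abs(z))
p_holm <- matrix(p.adjust(p_cell, method = "holm"),
                 nrow = nrow(z), dimnames = dimnames(z))
print(round(p_holm, 3))
```

Keeping the cutoff as a variable makes it easy to compare the uncorrected screening with a stricter, multiplicity-adjusted one.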

The example comes from http://www.normalesup.org/~carpenti/Not ... esidus.pdf (written in French, sorry), and the 2.33 limit comes from http://www1.udel.edu/FREC/ilvento/FREC408/normhand

I'm not a chi² specialist at all, so it's just a suggestion. If a stats wizard would be kind enough to step in, his or her advice would be much appreciated :)

Have a nice day.

by jonathon » Fri May 26, 2017 12:56 pm

eek, you're right romaric, there seem to be a lot of different ways to do this.

http://pareonline.net/getvn.asp?v=20&n=8

jonathon

by RHainez » Fri May 26, 2017 3:20 pm

Oh... well, it's less simple than I thought (I was expecting a quick and easy post-hoc test, like after an ANOVA), my apologies :/

by RHainez » Fri Jun 16, 2017 5:20 pm

Another option is the chisq.post.hoc() function.
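chisq.post.hoc() is not a base R function (it appears to come from the contributed fifer package), so the sketch below does not call it. Instead it hand-rolls the general idea of pairwise comparisons with base R only: every pair of rows is tested with chisq.test() and the p-values are adjusted with p.adjust(). The Holm method and the names row_pairs, p_raw and p_adj are assumptions chosen for the illustration.

```r
# Same table as in the earlier example
M <- as.table(rbind(c(212, 29, 11, 2, 3),
                    c(318, 61, 6, 11, 13),
                    c(160, 39, 9, 6, 12)))
dimnames(M) <- list(c("a1", "a2", "a3"), c("b1", "b2", "b3", "b4", "b5"))

# All pairs of row groups: a1/a2, a1/a3, a2/a3 (one pair per column)
row_pairs <- combn(rownames(M), 2)

# Chi-squared test on each 2 x 5 sub-table
# (warnings about small expected counts are silenced here)
p_raw <- apply(row_pairs, 2, function(pr) {
  suppressWarnings(chisq.test(M[pr, ]))$p.value
})
names(p_raw) <- apply(row_pairs, 2, paste, collapse = " vs ")

# Adjust for the three comparisons (Holm is one choice among several)
p_adj <- p.adjust(p_raw, method = "holm")

print(data.frame(p_raw = p_raw, p_adj = p_adj))
```

This compares whole rows rather than individual cells, so it answers "which groups differ" but not "in which category".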

And after reading the link you provided, it seems that one way to avoid post-hoc analysis after a chi² test "is to replace chisquare testing with log-linear analysis" (p. 8).

So one lead for solving this problem might be to use the glm() function?
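As a rough sketch of what the glm() route could look like, the table can be reshaped to one row per cell and fitted as a Poisson log-linear model. The dimension names A and B and the object names d, fit_indep, fit_sat and pearson_res are invented for the example. The independence model (Freq ~ A + B) reproduces the expected counts of the chi² test, so its Pearson residuals match Xsq$residuals, and comparing it with the saturated model (Freq ~ A * B) gives a likelihood-ratio test of association.

```r
# Same table as in the earlier example, with named dimensions
M <- as.table(rbind(c(212, 29, 11, 2, 3),
                    c(318, 61, 6, 11, 13),
                    c(160, 39, 9, 6, 12)))
dimnames(M) <- list(A = c("a1", "a2", "a3"),
                    B = c("b1", "b2", "b3", "b4", "b5"))

# Long format: one row per cell, columns A, B and Freq
d <- as.data.frame(M)

# Independence model and saturated model
fit_indep <- glm(Freq ~ A + B, family = poisson, data = d)
fit_sat   <- glm(Freq ~ A * B, family = poisson, data = d)

# Likelihood-ratio test of the A:B association
print(anova(fit_indep, fit_sat, test = "Chisq"))

# Pearson residuals of the independence model, reshaped to the table layout
pearson_res <- matrix(residuals(fit_indep, type = "pearson"),
                      nrow = nrow(M), dimnames = dimnames(M))
print(round(pearson_res, 3))
```

The coefficients of the interaction terms in fit_sat are where a log-linear analysis would look for the location of the association.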

It's just a suggestion and I'll let the stats wizards debate this, but a solution or a module (anyone interested? :) ) would be nice, so that mere mortals are not stuck with a significant difference (when the variables have more than two categories) and no way to locate it in the data.

And keep up the good work :)