
Removing outliers using filters

Posted: Thu Aug 21, 2025 11:23 am
by Melj1
Hello,

I’m trying to clean my data by removing outliers that are more than 3 SD above or below the mean. I have two between-subject groups (a 2 x 3 design), so the mean and SD are different for each group and condition. How can I use filters to remove the outliers?

Thank you!

Re: Removing outliers using filters

Posted: Tue Aug 26, 2025 5:09 am
by yurismol
I can suggest using the new function "Univariate outliers identification and removal" in the jYS module.
The 3*SD rule is the traditional Z-score method.
But you can use the modified Z-score, which is based on the median rather than the mean, if your data are not normally distributed.
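
For anyone curious what the modified Z-score does, here is a minimal Python sketch of the idea, outside jamovi. It assumes the commonly cited form 0.6745 * (x - median) / MAD with a cutoff of 3.5; I don't know whether the jYS module uses exactly these constants, so treat them (and the function names, which are made up for this example) as assumptions.

```python
from statistics import median

def modified_z_scores(values):
    """Modified Z-scores: 0.6745 * (x - median) / MAD (assumed form)."""
    med = median(values)
    # MAD = median of the absolute deviations from the median
    mad = median(abs(x - med) for x in values)
    if mad == 0:
        raise ValueError("MAD is zero; the modified Z-score is undefined")
    return [0.6745 * (x - med) / mad for x in values]

def keep_flags(values, cutoff=3.5):
    """Return 1 for values to keep and 0 for flagged outliers."""
    return [1 if abs(m) <= cutoff else 0 for m in modified_z_scores(values)]

data = [10, 12, 11, 13, 12, 11, 50]
flags = keep_flags(data)
print(flags)
```

With grouped data you would apply this separately within each group/condition cell, just as with the ordinary Z-score.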

Re: Removing outliers using filters

Posted: Tue Aug 26, 2025 11:50 pm
by jonathon
here's a video demonstrating the process step-by-step. you don't need to do it in so many steps, but i've done it this way to make it clearer.

https://youtu.be/bvjaiDAd3HE

step 1, we need a column which has a level for each group/condition combination. because one of the variables was text, i could simply use the + operator to join them.
step 2, we compute a Z score, using that group variable as the group_by
step 3, we use an if-statement to produce a value of 0 when the z-score is less than -2, or more than 2, and 1 otherwise (this is what the filters expect, 1 = good, 0 = filter out). for a 3 SD cut-off, use -3 and 3 instead.
step 4 seems a bit silly, we have to copy/paste the values from step 3 into a new column ... the reason is that otherwise we'd end up with an infinite loop ... the filters would exclude some rows, that would update the Z calculation, which would in turn change the filters, and so on.
step 5 we point the filter at the copy/pasted column ... you see that when we activate this filter, the z-calculations update because some rows have been excluded (that's why we had to copy/paste those values).
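
For readers who want to see the logic of the five steps outside jamovi, here is a rough Python sketch. It is not how jamovi does it internally; the function name, the toy data, and the use of the sample SD are assumptions made for illustration. The cutoff of 2 matches the video, and you can pass cutoff=3.0 instead.

```python
from collections import defaultdict
from statistics import mean, stdev

def outlier_flags(rows, cutoff=2.0):
    """rows: list of (group, condition, value). Returns 1 = keep, 0 = filter out."""
    # Step 1: one label per group/condition cell
    keys = [f"{group}_{condition}" for group, condition, _ in rows]
    # Step 2: mean and (sample) SD within each cell
    cells = defaultdict(list)
    for key, (_, _, value) in zip(keys, rows):
        cells[key].append(value)
    stats = {key: (mean(vals), stdev(vals)) for key, vals in cells.items()}
    # Step 3: 0 when |z| exceeds the cutoff, 1 otherwise
    flags = []
    for key, (_, _, value) in zip(keys, rows):
        m, sd = stats[key]
        z = (value - m) / sd if sd > 0 else 0.0
        flags.append(1 if abs(z) <= cutoff else 0)
    return flags

# Toy data (made up): two cells with very different means
rows = [("A", "low", v) for v in [10, 11, 9, 10, 11, 9, 10, 30]]
rows += [("B", "low", v) for v in [100, 102, 98, 101, 99]]

# Step 4: the flags are computed once and stored, so they do not
# change when rows are later excluded (the copy/paste equivalent)
flags = outlier_flags(rows)
# Step 5: filter on the stored flags
kept = [row for row, flag in zip(rows, flags) if flag == 1]
print(flags)
```

Note that 30 is flagged only because it is extreme relative to its own cell; the values around 100 in the other cell are kept, which is the point of grouping the Z score.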

jonathon