Winsorize using sd parameters

rookiestats
Posts: 3
Joined: Thu Apr 13, 2023 12:52 pm

Winsorize using sd parameters

Post by rookiestats »

I have to treat some outliers in my data. While I am aware that less treatment is better, I want to try Winsorizing my data.
How would I do this if I wanted to apply the treatment to any cases that lie more than 3.29 SD above or below the mean?
MAgojam
Posts: 421
Joined: Thu Jun 08, 2017 2:33 pm
Location: Parma (Italy)

Re: Winsorize using sd parameters

Post by MAgojam »

Hey @rookiestats,
do you think the suggestion in the screenshot is useful for you?
[Attached screenshot: winsorize.png]
Cheers,
Maurizio
rookiestats
Posts: 3
Joined: Thu Apr 13, 2023 12:52 pm

Re: Winsorize using sd parameters

Post by rookiestats »

Thanks @MAgojam,
I think this may help, but I'm having a little trouble understanding what your formula would do to the data specifically. I don't suppose you could explain each component of your formula?

Thank you,
rookiestats
MAgojam
Posts: 421
Joined: Thu Jun 08, 2017 2:33 pm
Location: Parma (Italy)

Re: Winsorize using sd parameters

Post by MAgojam »

rookiestats wrote: Fri Apr 14, 2023 1:37 am I don't suppose you could explain each component of your formula?
What I showed in the screenshot is how to obtain, with a computed variable in jamovi, the winsorization of a variable, much as you could in R with the winsorize() function of the datawizard package.
That function lets you winsorize with one of three methods.
Its "zscore" method offers a non-robust winsorization based on the mean and standard deviation, and a robust one based on the median and MAD (median absolute deviation).
I chose to show you the non-robust version.
The robust one is also possible, but takes a little longer to implement, because jamovi has no VMAD() function, whereas it does have VMED() for the median of the variable of interest, VMEAN() for the mean, and VSTDEV() for the standard deviation.
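
To make the robust variant concrete, here is a minimal Python sketch (not jamovi syntax) of what a median/MAD winsorization would compute. The function name winsorize_mad and the example scores are made up for illustration. The scaling constant 1.4826 is the default of R's mad() function; whether a given tool applies the same scaling is an assumption you should check.

```python
from statistics import median


def winsorize_mad(values, k=3.29, scale=1.4826):
    """Clamp values to median +/- k scaled MADs (hypothetical helper)."""
    med = median(values)
    # MAD: median of the absolute deviations from the median, scaled
    # (assumption: same constant as the default of R's mad()).
    mad = scale * median([abs(v - med) for v in values])
    lower, upper = med - k * mad, med + k * mad
    # Values inside [lower, upper] are returned unchanged.
    return [min(max(v, lower), upper) for v in values]


scores = [2, 4, 5, 7, 12, 15, 21, 21, 23, 97]  # illustrative data
print(winsorize_mad(scores))
```

In jamovi the same steps would need intermediate computed variables (the absolute deviations from VMED(), then their median), which is why this version is longer to set up.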

So, referring to the variable in the screenshot, VMEAN(myvar) returns the mean of myvar and VSTDEV(myvar) returns its standard deviation.

The upper and lower cutoff values for myvar are then obtained by adding 20% of the standard deviation to the mean and subtracting it from the mean, respectively (to apply your 3.29 SD criterion, multiply the standard deviation by 3.29 instead).

The nested IF() functions then pick out the myvar values that lie between the two cutoffs and leave them unchanged when computing the new wins_myvar variable. Values of myvar above the upper cutoff are replaced with the upper cutoff, and values below the lower cutoff are replaced with the lower cutoff.
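
The logic just described can be sketched in a few lines of Python (again, not jamovi syntax). The function name winsorize_sd and the example scores are made up for illustration, and the sketch assumes the sample standard deviation (n - 1 denominator), which may or may not match what your tool uses.

```python
from statistics import mean, stdev


def winsorize_sd(values, k=3.29):
    """Clamp values to mean +/- k standard deviations (hypothetical helper)."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD, n - 1 denominator
    lower, upper = m - k * s, m + k * s
    # Equivalent to the nested IF(): keep values between the cutoffs,
    # replace anything beyond a cutoff with that cutoff.
    return [min(max(v, lower), upper) for v in values]


scores = [2, 4, 5, 7, 12, 15, 21, 21, 23, 97]  # illustrative data
print(winsorize_sd(scores, k=2.0))
```

The demo uses k=2.0 because no value in this small sample lies more than 3.29 SD from the mean, so the default cutoff would leave it unchanged. With small samples, a 3.29 SD criterion may never flag anything.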

If you do not expect the variables being winsorized to change, you can simplify things by taking the values you need (mean and standard deviation) from a Descriptives analysis of the variables of interest and using them directly in the formula.

Cheers,
Maurizio
reason180
Posts: 276
Joined: Mon Jul 24, 2017 4:56 pm

Re: Winsorize using sd parameters

Post by reason180 »

It's my impression that Winsorization usually involves converting the scores in the top and bottom n-tile of a distribution (that is, the n highest and n lowest scores) so that they become equal to the highest or lowest non-converted score, respectively.

Original { 2, 4, 5, 7, 12, 15, 21, 21, 23, 97 }
Winsorized top and bottom 20% { 5, 5, 5, 7, 12, 15, 21, 21, 21, 21 }

See https://en.wikipedia.org/wiki/Winsorizing#:~:text=Winsorizing%20or%20winsorization%20is%20the,Winsor%20(1895%E2%80%931951).
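
As a rough illustration of this count-based approach, here is a minimal Python sketch. The function name winsorize_count is made up for the example, and tools that work from percentiles may choose their cutoffs slightly differently.

```python
def winsorize_count(values, n):
    """Replace the n lowest and n highest values with the nearest
    retained value (hypothetical helper)."""
    if n < 0 or 2 * n >= len(values):
        raise ValueError("n must be >= 0 and leave at least one value untouched")
    ordered = sorted(values)
    low = ordered[n]        # smallest value that is kept as-is
    high = ordered[-n - 1]  # largest value that is kept as-is
    return [min(max(v, low), high) for v in values]


original = [2, 4, 5, 7, 12, 15, 21, 21, 23, 97]
# Top and bottom 20% of 10 scores means n = 2 on each side.
print(winsorize_count(original, 2))
# prints [5, 5, 5, 7, 12, 15, 21, 21, 21, 21]
```

With n = 2 this reproduces the winsorized set shown above.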
MAgojam
Posts: 421
Joined: Thu Jun 08, 2017 2:33 pm
Location: Parma (Italy)

Re: Winsorize using sd parameters

Post by MAgojam »

reason180 wrote: Sun Apr 16, 2023 6:14 pm It's my impression that Winsorization usually involves converting the scores in the top and bottom n-tile of a distribution...
Yes, this is also a method for winsorizing a variable.
It is the "percentile" method that I alluded to but did not show in my previous answer, which covered a (simple) approach that is possible with a computed variable.
Maybe the attached file can help you more on the subject.
[Attached file: winsorized.omv]
What do you think?

Cheers,
Maurizio