Multivariate outliers

Posted: Sat Oct 19, 2019 10:43 pm
by Dilara_
How can I detect multivariate outliers in my data set? I want to use the Mahalanobis distance, but it is not available in jamovi. How can I compute it manually?

Re: Multivariate outliers

Posted: Sat Oct 19, 2019 11:37 pm
by jonathon
hi,

perhaps with the Rj editor?

https://blog.jamovi.org/2018/07/30/rj.html

kind regards

jonathon

Re: Multivariate outliers

Posted: Tue Oct 22, 2019 8:33 am
by MAgojam
Hi, @Dilara_.
I am attaching a screenshot with a simple example of using the Rj editor (as suggested by Jonathon).
Perhaps the script, which uses the mahalanobis function, will help answer your question.
[Attachment: ScreenShot.png]
Cheers.
Maurizio

Re: Multivariate outliers

Posted: Tue Oct 22, 2019 12:32 pm
by Dilara_
I'll try it. Thank you

Re: Multivariate outliers

Posted: Fri Jul 09, 2021 2:57 am
by DeborahA
Hi there! I tried implementing this code but I could not for the life of me get it to work. So instead here's some simplified code that I wrote to achieve this for a student - I thought I might as well share it here.

Code:

library(dplyr)

# Select only the variables you want to use.
# In the Rj editor, the jamovi dataset is available as `data`.
dat <- select(data, "V1", "V2", "V3", "V4")

n <- nrow(dat) # number of cases
p <- ncol(dat) # number of variables (degrees of freedom for the chi-square reference)

Sx <- cov(dat) # get the covariance matrix

# Calculate the squared Mahalanobis distance of each case from the column means
D2 <- mahalanobis(dat, colMeans(dat), Sx)

# Optionally, make some fancy plots
plot(density(D2, bw = 0.5),
     main = paste0("Squared Mahalanobis distances, n=", n, ", p=", p))
rug(D2)
qqplot(qchisq(ppoints(n), df = p), D2,
       main = bquote("Q-Q plot of Mahalanobis" ~ D^2 ~
                     "vs. quantiles of" ~ chi[.(p)]^2))
abline(0, 1, col = "gray")

# Add the distances to your selected dataset and flag cases
# whose squared distance is greater than 4.5
dat$mahalanobis <- D2
dat$outlier <- dat$mahalanobis > 4.5

dat
I hope this helps someone!
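
The cutoff of 4.5 in the code above is a fixed value. A common alternative is to compare each squared distance with a chi-square critical value whose degrees of freedom equal the number of variables. Here is a minimal sketch of that idea, assuming a conventional alpha of .001 and using simulated stand-in data (the data and the `alpha` value are assumptions for illustration; in Rj you would start from `data` instead):

```r
# Simulated stand-in for the selected variables (hypothetical data, for illustration only)
set.seed(1)
dat <- data.frame(V1 = rnorm(100), V2 = rnorm(100),
                  V3 = rnorm(100), V4 = rnorm(100))

# Keep complete rows only: cov() yields NA when values are missing
dat <- dat[complete.cases(dat), ]

p  <- ncol(dat)                                  # number of variables
D2 <- mahalanobis(dat, colMeans(dat), cov(dat))  # squared Mahalanobis distances

alpha  <- 0.001                                  # assumed significance level
cutoff <- qchisq(1 - alpha, df = p)              # chi-square critical value

dat$mahalanobis <- D2
dat$p_value     <- pchisq(D2, df = p, lower.tail = FALSE)  # upper-tail probability
dat$outlier     <- D2 > cutoff

dat[dat$outlier, ]  # show only the flagged cases
```

Because the cutoff is derived from `p`, it adjusts automatically when you change the number of selected variables.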

It would be great to see this implemented in jamovi some time soon, as our Honours students are expected to calculate this.

Best

Deborah.

Re: Multivariate outliers

Posted: Thu Aug 05, 2021 5:44 pm
by Claire1998
Hi there,

I am trying to use the code for the Rj editor to calculate the Mahalanobis distance (MD). I have a large data set with approximately 100 variables. Is there a way to number the columns, or to enter the correct column numbers into the code without counting them manually?

Thanks

Re: Multivariate outliers

Posted: Mon Aug 09, 2021 8:01 am
by jonathon
hey,

in R you can refer to columns by name, if that's easier than referring to them by column number.
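
As a small illustration of selecting by name, here is a sketch that assumes hypothetical column names such as `id`, `age`, and `item1` to `item5` (yours will differ). Base R can look up the positions for you, so nothing has to be counted by hand:

```r
# Hypothetical stand-in for a wide dataset; in Rj the jamovi data is available as `data`
set.seed(1)
data <- data.frame(id = 1:20, age = rnorm(20, 40, 10),
                   item1 = rnorm(20), item2 = rnorm(20), item3 = rnorm(20),
                   item4 = rnorm(20), item5 = rnorm(20))

names(data)                    # list the column names in order
which(names(data) == "item1")  # position of a single column

# Pick columns by name directly
dat <- data[, c("item1", "item2", "item3")]

# Pick every column whose name starts with "item"
dat_items <- data[, grepl("^item", names(data))]

# Pick a contiguous run of columns, from "item2" through "item5"
first <- which(names(data) == "item2")
last  <- which(names(data) == "item5")
dat_range <- data[, first:last]
```

With dplyr loaded, `select(data, item2:item5)` or `select(data, starts_with("item"))` should give the same selections.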

cheers