PCA analysis and log transformation of data

by ACT2021 » Tue May 11, 2021 10:45 am

I have a couple of questions about cleaning data before importing it into Jamovi for PCA.

My data is multivariate, compositional geochemical data. Missing values, censored data and units have already been cleaned (everything is in ppm). Some measurements are very small, but still important to discern. Some of my variables are not normally distributed.

Do I need to log-transform my data before I carry out PCA in Jamovi?

If I do need to transform it, I found the 'readthedocs' article that refers to the LOG10 transform, but I can't work out how to enter the correct transform formula in Jamovi. The examples seem to focus on categories rather than log or log-ratio transformations, and don't go far enough to show an example of this.

Any help would be very much appreciated.

by DavoFromDapto » Wed Jun 02, 2021 5:04 am

I'm new to Jamovi, and so I'll respond only to the statistics/data cleaning part of your questions.

No, you don't need to do a log transformation before a PCA.

There are several rationales for transforming variables. One is highly skewed data. The normal distribution is symmetric, and a good, under-used measure of symmetry is the skewness statistic (sk), whose reference value for the normal distribution is sk = 0. Take a look at the skewness coefficients of your variables. Monte Carlo studies suggest that variables where |sk| < 2 can be treated "as if" they are normal. Variables with strong positive skew, that is, where sk > 2, can be log-transformed. (It doesn't matter whether you use the base-10 or the natural log: the two differ only by a constant factor, so the skewness comes out the same either way.) The log transformation will reduce the skewness, typically to below 2. There is nothing special about 2; it is a rule of thumb or guideline rather than a benchmark.
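If it helps to see the skewness check spelled out, here is a rough sketch in plain Python (not Jamovi). The data are simulated stand-ins for a skewed ppm variable, and the helper names `skewness` and `log_if_skewed` are made up for illustration. It uses the simple moment-based skewness; other packages may apply a small-sample correction, so their values can differ slightly.

```python
import math
import random


def skewness(values):
    """Moment-based sample skewness m3 / m2**1.5 (about 0 for symmetric data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_if_skewed(values, threshold=2.0):
    """Log10-transform a variable only if its skewness exceeds the threshold.

    The threshold of 2 is the rule of thumb from the post, not a hard cutoff.
    All values must be strictly positive for the log to be defined.
    """
    if skewness(values) > threshold:
        return [math.log10(v) for v in values]
    return list(values)


# Simulated, hypothetical data: a strongly right-skewed positive variable.
rng = random.Random(0)
ppm = [rng.lognormvariate(0.0, 1.5) for _ in range(2000)]

logged = log_if_skewed(ppm)
sk_raw = skewness(ppm)
sk_log10 = skewness(logged)
sk_ln = skewness([math.log(v) for v in ppm])

print("skewness before:", round(sk_raw, 2))
print("skewness after log10:", round(sk_log10, 2))
print("skewness after ln:", round(sk_ln, 2))
```

The last two printed values should agree, which is the point about the choice of log base not mattering for skewness.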

