Linear Regression - Robust standard error

Jorge · Post by **Jorge** » Thu Apr 01, 2021 8:57 pm

Dear Jamovi team,

Thank you for making jamovi available. I teach statistics to economics, business and international relations students and we've been using for (way too many) years SPSS at the undergraduate level.

I recently heard about jamovi and I am very interested in replacing SPSS by jamovi in my classes. However, in regression analysis, we use the robust standard error of the coefficients (in fact, the White-Huber estimator for standard error) a lot. This is a very importance feature for economics, business and international relations students as the robust standard error corrects for misspecifications and deals with heteroskedasticity.

While this is available in SPSS (in GLM), I couldn't find it in jamovi. In jamovi, we have the possibility of testing for homoskedasticity (with the moretests module), but it would be relevant to apply the robust estimator in case the test points out for the existence of heteroskedasticity.

Could you please help me and tell me how to find such a feature? In the event jamovi does not include that, could you please tell me if you plan to include it in next versions (because I am preparing the fall semester and would like to use jamovi)?

Thank you

Post by **jonathon** » Fri Apr 02, 2021 10:03 pm

hi jorge,

i don't know much about the particular measure. if you have time, i'm happy to work with you in having it included in moretests.

cheers

jonathon

Jorge · Post by **Jorge** » Sat Apr 03, 2021 6:10 pm

Hi Jonathon,

Thank you for your quick and kind reply.

I am sorry if I will write things that you are already familiar with, but I just want you to understand the issue. I will try to make it short:

The linear regression model Y=XB+u allows us to estimate coefficients B (Column estimate in jamovi) and the corresponding standard errors (Column SE in jamovi). This model only works if some assumptions are met, namely that u is homoscedastic (that is, the variance of u is constant).

If u does not have a constant variance (we say, in this case, that there is heteroskedasticity), then the coefficients B remain the same but statistical inference is not consistent (meaning, e.g., that t-tests for individual significance do not work) as the standard errors of the B coefficients are not correctly estimated.

Heteroskedasticity happens in 99.9% of applications in economics, business and international relations, which explains why several statistical software includes tests for this (so does jamovi – we have the well-known Breusch-Pagan test in moretests). In fact, heteroskedasticity is so common that most people in these areas simply use the robust estimator and don’t check it with the appropriate test. Journals in these areas always want us to replicate regression results with the robust estimator, because results without it may not be trusted.

With heteroskedasticity, we need to get the correct values of the standard errors of the coefficients so that we can compute the appropriate statistical significance of any independent variable. This is where the Huber-White estimator enters. Huber and White proposed an estimator to deal with heteroskedasticity, allowing for the correction of the standard errors and, therefore, validating statistical inference. For informal background, check the wikipedia article on this here:

https://en.wikipedia.org/wiki/Heterosce ... ard_errors

Several statistical software include this estimator for the standard errors – sadly, I do not work with R, but for instance in Stata it’s the vce(robust) option; in SPSS, you go to Analyze – generalized linear models – estimation and select robust estimator.

Although I don’t understand R, I searched this issue in google and found out that, apparently, R also has (some) functions to calculate the corrected standard errors: e.g., the hccm function – please see:

https://www.rdocumentation.org/packages ... juCHGzcY10

Also, here you have the comparison of results with this corrected estimator between the R and Stata procedures:

https://stats.stackexchange.com/questio ... aM9Q2OirZg

As jamovi is based on R, I hope this helps.

Please let me know how can I help you to implement this. It would really be wonderful to have this feature in regression analysis (included in moretests, for instance).

Cheers,
Jorge

EDIT: I found a paper that describes how to implement this in R, because it says the functions to use. I suppose it will be helpful. It is here:
https://cran.r-project.org/web/packages ... ndwich.pdf

Post by **jonathon** » Sun Apr 04, 2021 10:44 pm

thanks jorge,

could you create an issue on our github issues page? this is probably not something i'll get to soon, and putting it on the issues page will ensure it doesn't get forgotten.

with thanks

Jorge · Post by **Jorge** » Mon Apr 05, 2021 11:43 am

Hi Jonathon,

Thank you for your reply. I have created the issue on github - in addition to the text, I have also included a link to a paper that states the functions in R that are used to solve this issue.

Do you think it would be possible to have the robust standard error on jamovi by next september? I am asking because I would really like to use jamovi in my Stats course in the fall.

Thank you.

MAgojam · Post by **MAgojam** » Tue Apr 06, 2021 2:36 pm

Hi, @Jorge.
Lately I see the demands for "Robust Standard Errors" in jamovi have increased.
I think many of these could be by updating the moretest module, clean and elegant module for ready-to-use results.
The time available for the update, however, is always running out, but without abandoning the idea of using jamovi even when something that could be useful has to arrive, you can use another (fantastic) module.
I'm referring to Rj, where with a little bit of R code you can get the answers you are looking for, before packing and testing a module's functionality.
If it will be useful to you or to others, I report a function to be used in Rj to obtain robust standard errors, originally written by Jonh Fox (some time ago), which I modified for some colleagues waiting for the form, so that they can continue with jamovi.
This is the code, to copy and paste in Rj, but I am also attaching a simple file to see it working.

Code: Select all

## -------------------------------------------------
## This is a function originally written by John Fox 
## to which I have made some changes for use by Rj.
summary_rse <- function(object, typHC = c("hc0", "hc1", "hc2", "hc3", "hc4")){
    if (!require(car)) stop("Required car package is missing.")
    
    if(is.na(object$df.residual) || NROW(object$qr$qr) - object$rank != object$df.residual) 
        warning("Residual degrees of freedom in object suggest this is not an \"lm\" fit.")
    
    s_rse     <- summary(object)
    
    table     <- coef(s_rse)
    table[,2] <- sqrt(diag(hccm(object, type = match.arg(typHC))))
    table[,3] <- table[,1]/table[,2]
    table[,4] <- 2*pt(abs(table[,3]), df.residual(object), lower.tail=FALSE)
    
    s_rse$coefficients <- table
    
    hyp <- cbind(0, diag(nrow(table) - 1))
    s_rse$fstatistic[1] <- linearHypothesis(object, 
                                            hyp, 
                                            white.adjust = match.arg(typHC))[2,"F"]
    
    dimnames(s_rse$coefficients) <- list(names(object$coefficients)[object$qr$pivot],
                                         c("Estimate", 
                                           paste0("SE[", match.arg(typHC),"]"), 
                                           "t value", 
                                           "Pr(>|t|)"))
    return(s_rse)
}

Rj_rse_auto.omv: (9.79 KiB) Downloaded 2304 times

Cheers,
Maurizio

Jorge · Post by **Jorge** » Wed Apr 07, 2021 8:43 am

Dear Maurizio,

Thank you for your kind reply and for sharing this with me.

Unfortunately, I don't work with R (only with SPSS and Stata). Thus, could you please tell me what do I have to do with this code?

I installed the Rj module. Then I opened one of my simple datasets and run a linear regression. Then, I opened Rj editor and pasted the code you provided (and clicked run). Nevertheless, nothing happened. For sure, I am doing something wrong, as your sample file seems to present the corrected standard errors.

I would be grateful if you could give me details on this.

Cheers,
Jorge

MAgojam · Post by **MAgojam** » Wed Apr 07, 2021 4:56 pm

Jorge wrote:I would be grateful if you could give me details on this.

Hi Jorge,
with pleasure.

In the attached screenshot you will see a simple comparison between Stata and Jamovi with the R environment of the Rj module.

: STATA_RJ_rse.PNG (233.12 KiB) Viewed 49655 times

The file used in jamovi in the previous post is a file that you can recall in Stata (sysuse auto).
Always in State with: regress mpg turn trunk
gets an output with Std. Err. (Not Robust)
while with: regress mpg turn trunk, vce(robust)
you will get an output with Robust St. Err.
In Rj you have the possibility to use an R environment and the relative libraries of functions that jamovi brings with its installation.

In Rj with data you refer to the file you have opened in jamovi and with the following line of code:
lm_model <- lm(mpg ~ turn + trunk, data = data)
using the assignment operators "<-" you pass the object of the regression function
lm(mpg ~ turn + trunk, data = data)
to the lm_model variable (choose a name that remembers what it is), which can be used individually or in other functions, without the need to repeat the regression again.

To obtain an output similar to that of Stata
regress mpg turn trunk
with (NOT Robust) Std. Err. use this line of code:
summary(lm_model)

To obtain an output similar to that of Stata
regress mpg turn trunk, vce(robust)
with Robust Std. Err. use the new function above with this line of code:
summary_rse(lm_model, "hc1")

Cheers,
Maurizio

Jorge · Post by **Jorge** » Wed Apr 07, 2021 6:42 pm

Dear Maurizio,

Thank you for your detailed explanation. Although I am not familiar with R, I followed the steps and it worked beautifully. I have also tried with another dataset and the results are exactly the same I got with Stata!

Your post was really helpful.

Until jamovi includes this item, I suppose I can have this code in, say, a word document and each time my students want to estimate the model with robust standard errors, they just have to change the name of the variables and copy the text to jamovi/Rj editor.

Still, I hope that Jonathon incorporates this feature in jamovi soon. Given that R codes are already available, I suppose that will not take much time, right Jonathon?

Thanks again

Cheers,
Jorge

Post by **jonathon** » Thu Apr 08, 2021 1:15 am

i am not sure.

i would need to look into it further.

jonathon

jamovi

Linear Regression - Robust standard error

Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Re: Linear Regression - Robust standard error

Login • Register