R2 and adjusted R2 for simple linear regression

Biochemist
Posts: 35
Joined: Mon Feb 19, 2024 9:44 am

R2 and adjusted R2 for simple linear regression

Post by Biochemist »

Hi,

As far as I understood, adjusted R2 is meant to make a correction for adding variables in a multiple linear regression model, which inevitably leads to a higher R2 value even if the variables do not really contribute much to the model and are rather "useless" in the model.

What I wonder now is: Why do R2 and adjusted R2 still differ a bit even for simple linear regression with just one independent X variable and the dependent Y variable? Is this always the case or is this just because of how jamovi calculates adjusted R2?

Does it even make sense to have adjusted R2 calculated for simple linear regression or should this only be used for multiple linear regression?
Bobafett
Posts: 84
Joined: Thu Jul 18, 2019 11:33 am

Re: R2 and adjusted R2 for simple linear regression

Post by Bobafett »

As I recall, R² reflects the proportion of variance accounted for in your sample of participants, whereas adjusted R² estimates that proportion in the population. I forget the exact formula, but the adjustment gives a more conservative estimate than the unadjusted R², so the adjusted value you see in your analysis should, if anything, be slightly smaller.
MAgojam
Posts: 436
Joined: Thu Jun 08, 2017 2:33 pm
Location: Parma (Italy)

Re: R2 and adjusted R2 for simple linear regression

Post by MAgojam »

Hey @Biochemist,
maybe you've already looked at the code here:
https://github.com/jamovi/jmv/blob/mast ... #L663-L704

In addition to what has already been mentioned (thanks, @Bobafett), it's worth recalling the formal definitions of R² and adjusted R², and how these metrics are computed, even in the case of a simple linear regression with just one independent variable.
  • R² (Coefficient of Determination):
    R² is defined as the proportion of the variance in the dependent variable (Y) that is explained by the model.
    Mathematically, R² = 1 - (SSE/SST), where SSE (Sum of Squared Errors) is the sum of squared residuals, and SST (Total Sum of Squares) is the sum of squared deviations of Y from its mean.
    A higher R² indicates that the model explains more variance in your sample data.
  • Adjusted R²:
    Adjusted R² accounts for the number of parameters estimated relative to the sample size.
    For a model with p predictors (excluding the intercept) and n observations, the formula is:
    R_adj² = 1 - (1 - R²) * ((n - 1) / (n - p - 1))
    Even in the case of simple linear regression (p = 1), the factor (n - 1) / (n - 2) is greater than 1, so adjusted R² will be slightly different from R².
    This stems from the fact that adjusted R² attempts to correct for the upward bias in R² as an estimate of the population-level explanatory power.
    While the difference may be small with just one predictor, it is always present unless R² is exactly 1, and it makes adjusted R² smaller than R².
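
To make the two definitions above concrete, here is a minimal Python sketch that fits a simple linear regression by ordinary least squares and applies both formulas. The function names and the toy data are made up for illustration; this is not jamovi's code, just the textbook equations written out.

```python
def r_squared(x, y):
    """R² = 1 - SSE/SST for a least-squares line fitted to (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # SSE: sum of squared residuals around the fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: sum of squared deviations of y from its mean
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


def adjusted_r_squared(r2, n, p):
    """R_adj² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 observations")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Hypothetical toy data: one predictor (p = 1), five observations
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r2 = r_squared(x, y)
r2_adj = adjusted_r_squared(r2, n=len(x), p=1)
print(f"R2 = {r2:.4f}, adjusted R2 = {r2_adj:.4f}")
```

With p = 1 the adjusted value comes out below R² for any data set that is not a perfect fit, which is the behaviour asked about in the original question.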
Why do they differ even for a single predictor?
When you estimate even a single slope parameter from your data, adjusted R² applies a "penalty" through the degrees of freedom: the formula is equivalent to 1 - (SSE / (n - p - 1)) / (SST / (n - 1)), so with p = 1 the residual sum of squares is divided by n - 2 while the total sum of squares is divided by n - 1.
This penalty remains relevant even with a single predictor, providing a more conservative view of how well the model might generalize beyond your specific sample.
Although the discrepancy between R² and adjusted R² may be minimal in simple linear regression, the adjusted R² still has conceptual value.
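
For a sense of how large the gap is, the adjusted R² formula can be rearranged as follows (a short derivation from the standard formula, with made-up numbers plugged in at the end):

```latex
\begin{align*}
R^2 - R^2_{\text{adj}}
  &= R^2 - 1 + (1 - R^2)\,\frac{n - 1}{n - p - 1} \\
  &= (1 - R^2)\left(\frac{n - 1}{n - p - 1} - 1\right) \\
  &= (1 - R^2)\,\frac{p}{n - p - 1}
\end{align*}
% Simple linear regression (p = 1): the gap is (1 - R^2) / (n - 2).
% Illustrative values: R^2 = 0.80, n = 20 gives 0.20 / 18 \approx 0.011,
% so R^2_adj \approx 0.789.
```

So with one predictor the gap is never zero unless R² = 1, but it shrinks as the sample size grows or as the fit improves.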

jamovi reports both R² and adjusted R² by default in its linear regression output. The software uses the standard formulas, so what you see is simply the direct application of the theoretical equations.
In the "Model Fit Measures" table (.populateModelFitTable() function), you can observe that adjusted R² is slightly lower than R², reflecting a more cautious estimate of how well the model may perform in the broader population, not just the current sample.

In summary, even in simple linear regression, adjusted R² can offer insight, albeit with a smaller difference from R² compared to models with multiple predictors.
This difference reflects the statistical principle that sample-based estimates should be “shrunk” slightly to account for the uncertainty in estimating parameters from data.

Cheers,
Maurizio
https://www.jamovi.org/about.html
Biochemist
Posts: 35
Joined: Mon Feb 19, 2024 9:44 am

Re: R2 and adjusted R2 for simple linear regression

Post by Biochemist »

Thanks for your answers, especially for your very detailed and elaborate answer, Maurizio. That made it clear.