
Coefficient of Determination (R-Squared)

Purpose

The coefficient of determination (R-squared) indicates the proportion of the variation in the response variable y that is explained by the independent variables X in the linear regression model. The larger the R-squared value, the more of the variability the linear regression model explains.

Definition

R-squared is the proportion of the total sum of squares explained by the model. Rsquared, a property of the fitted model, is a structure with two fields:

• Ordinary — Ordinary (unadjusted) R-squared

${R}^{2}=\frac{SSR}{SST}=1-\frac{SSE}{SST}.$

• Adjusted — R-squared adjusted for the number of coefficients

${R}_{adj}^{2}=1-\left(\frac{n-1}{n-p}\right)\frac{SSE}{SST}.$

SSE is the error (residual) sum of squares, SSR is the regression sum of squares, SST is the total sum of squares, n is the number of observations, and p is the number of regression coefficients. Note that p includes the intercept, so, for example, p is 2 for a straight-line fit with one predictor. Because ordinary R-squared never decreases when predictor variables are added to the regression model, the adjusted R-squared penalizes the number of coefficients in the model. This makes it more useful for comparing models with different numbers of predictors.
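The definitions above can be checked by hand. The following is a minimal sketch, not part of the fitted-model workflow: it uses synthetic data and hypothetical variable names (A, b, yhat, R2, R2adj), fits a straight line with the backslash operator, and computes the three sums of squares and both R-squared values directly from the formulas.

```matlab
% Sketch: compute the sums of squares and both R-squared values by hand.
% The data are synthetic and purely illustrative.
rng(1);                          % fix the random seed for reproducibility
n = 40;                          % number of observations
x = linspace(0, 10, n)';         % single predictor
y = 3 + 2*x + randn(n, 1);       % noisy straight-line response

A = [ones(n, 1) x];              % design matrix with an intercept column
p = size(A, 2);                  % number of coefficients (2 for a straight line)
b = A \ y;                       % least-squares coefficient estimates
yhat = A * b;                    % fitted values

SSE = sum((y - yhat).^2);        % error (residual) sum of squares
SSR = sum((yhat - mean(y)).^2);  % regression sum of squares
SST = sum((y - mean(y)).^2);     % total sum of squares

R2    = 1 - SSE/SST;                      % ordinary R-squared
R2adj = 1 - ((n - 1)/(n - p)) * SSE/SST;  % adjusted R-squared
fprintf('R2 = %.4f, adjusted R2 = %.4f\n', R2, R2adj);
```

Because the model includes an intercept, SSR + SSE equals SST up to rounding, so SSR/SST and 1 - SSE/SST give the same ordinary R-squared.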

How To

After obtaining a fitted model, say, mdl, using fitlm or stepwiselm, you can obtain either R-squared value as a scalar by indexing into the Rsquared property using dot notation. For example:

mdl.Rsquared.Ordinary

You can also obtain SSE, SSR, and SST from the model properties of the same names.

mdl.SSE
mdl.SSR
mdl.SST
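As an illustration, these properties can be combined to reproduce both fields of Rsquared. This is a sketch under stated assumptions: the data are synthetic, the variable names R2 and R2adj are hypothetical, and n and p are read from the NumObservations and NumCoefficients properties of the fitted model. It requires Statistics and Machine Learning Toolbox for fitlm.

```matlab
% Sketch: rebuild both R-squared values from the model's sums of squares.
rng(0);                              % reproducible synthetic data
X = randn(50, 2);                    % two predictors
y = 1 + X*[2; -1] + randn(50, 1);    % linear response with noise
mdl = fitlm(X, y);                   % fitted LinearModel object

n = mdl.NumObservations;             % number of observations
p = mdl.NumCoefficients;             % number of coefficients, intercept included

R2    = 1 - mdl.SSE/mdl.SST;                      % ordinary R-squared
R2adj = 1 - ((n - 1)/(n - p)) * mdl.SSE/mdl.SST;  % adjusted R-squared

% Each row shows the hand-computed value next to the stored value.
disp([R2,    mdl.Rsquared.Ordinary]);
disp([R2adj, mdl.Rsquared.Adjusted]);
```

The two values in each displayed row should agree to within floating-point rounding.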

Display Coefficient of Determination

This example shows how to display R-squared (coefficient of determination) and adjusted R-squared.

Load the sample data and define the response and independent variables.

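load hospital                      % sample dataset array shipped with the toolbox
y = hospital.BloodPressure(:,1);   % response: first blood pressure column
X = double(hospital(:,2:5));       % predictors: columns 2 through 5 as a numeric matrix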

Fit a linear regression model.

mdl = fitlm(X,y)
mdl =
Linear regression model:
y ~ 1 + x1 + x2 + x3 + x4

Estimated Coefficients:
               Estimate        SE        tStat        pValue
               _________    ________    ________    __________

(Intercept)        117.4      5.2451      22.383    1.1667e-39
x1               0.88162      2.9473     0.29913       0.76549
x2               0.08602     0.06731       1.278       0.20438
x3             -0.016685    0.055714    -0.29947       0.76524
x4                 9.884      1.0406       9.498    1.9546e-15

Number of observations: 100, Error degrees of freedom: 95
Root Mean Squared Error: 4.81
R-squared: 0.508,  Adjusted R-Squared: 0.487
F-statistic vs. constant model: 24.5, p-value = 5.99e-14

The R-squared and adjusted R-squared values are 0.508 and 0.487, respectively. The model explains about 51% of the variability in the response variable.
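As a consistency check, the adjusted value can be recovered from the ordinary value in the display. With n = 100 observations and 95 error degrees of freedom, p = 5 (the intercept plus four predictors), and SSE/SST = 1 - 0.508 = 0.492. The arithmetic uses the rounded values from the display, so it is approximate:

${R}_{adj}^{2}=1-\left(\frac{100-1}{100-5}\right)\left(1-0.508\right)=1-\frac{99}{95}\times 0.492\approx 0.487.$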

Access the R-squared and adjusted R-squared values using the Rsquared property of the fitted LinearModel object.

mdl.Rsquared.Ordinary
ans = 0.5078
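The adjusted value is read the same way, through the Adjusted field. The following sketch repeats the fit so that it runs on its own; the variable names r2 and r2adj are hypothetical, and it assumes Statistics and Machine Learning Toolbox with its hospital sample data.

```matlab
% Sketch: read both fields of the Rsquared property after fitting.
load hospital                        % sample dataset array
y = hospital.BloodPressure(:,1);     % response variable
X = double(hospital(:,2:5));         % predictor matrix
mdl = fitlm(X, y);                   % fit the linear regression model

r2    = mdl.Rsquared.Ordinary;       % ordinary (unadjusted) R-squared
r2adj = mdl.Rsquared.Adjusted;       % R-squared adjusted for the number of coefficients
fprintf('R-squared: %.3f, adjusted R-squared: %.3f\n', r2, r2adj);
```

The printed values should agree with those reported in the model display.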