The coefficient of determination, commonly known as $R^2$, measures how much of the variability of the response variable the model explains. For a least-squares linear model with an intercept, evaluated on the data it was fitted to, it ranges from 0 to 1: 1 indicates that the model explains all of the variance, and 0 indicates that it explains none. On new data, or for models without an intercept, it can be negative.
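As a minimal sketch of this definition, the snippet below computes the score by hand with NumPy. It assumes the usual formula $R^2 = 1 - SS_{res}/SS_{tot}$, which the text above does not spell out, and the helper name `r2_manual` is hypothetical.

```python
import numpy as np

def r2_manual(y_true, y_pred):
    """Compute R^2 = 1 - SS_res / SS_tot (hypothetical helper for illustration)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Perfect predictions give R^2 = 1
print(r2_manual(y_true, y_true))

# Always predicting the mean of y gives R^2 = 0
print(r2_manual(y_true, np.full(4, y_true.mean())))
```

The two calls illustrate the endpoints of the range: a model that reproduces the data exactly, and a model that does no better than the mean of the response.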
Here’s a step-by-step example of computing the coefficient of determination in Python, using scikit-learn to perform linear regression and compute $R^2$:
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
import matplotlib.pyplot as plt
# Generate sample data
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1) # Feature
y = 3 * X.flatten() + np.random.normal(0, 1, 100) # Target with some noise
# Fit the model
model = LinearRegression()
model.fit(X, y)
# Predict
y_pred = model.predict(X)
# Compute R^2 score
r2 = r2_score(y, y_pred)
print(f"Coefficient of Determination (R^2) using scikit-learn: {r2:.2f}")
# Plotting
plt.scatter(X, y, color='blue', label='Data')
plt.plot(X, y_pred, color='red', linewidth=2, label='Fitted line')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Linear Regression and R^2')
plt.legend()
plt.show()
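For a simple linear regression with an intercept, $R^2$ equals the squared Pearson correlation between X and y. The sketch below checks this on data generated the same way as in the example. It is only an illustration: `np.polyfit` is swapped in for `LinearRegression` so that the snippet needs NumPy alone.

```python
import numpy as np

# Synthetic data generated the same way as in the example
np.random.seed(42)
x = np.linspace(0, 10, 100)
y = 3 * x + np.random.normal(0, 1, 100)

# Least-squares line fitted with np.polyfit (swapped in for LinearRegression)
slope, intercept = np.polyfit(x, y, 1)
y_pred = slope * x + intercept

# R^2 from its definition: 1 - SS_res / SS_tot
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Squared Pearson correlation between x and y
r = np.corrcoef(x, y)[0, 1]

print(f"R^2 from the definition: {r2:.4f}")
print(f"Squared correlation:     {r ** 2:.4f}")
```

The two printed values should agree up to floating-point error, which is a quick sanity check on any $R^2$ reported for a one-feature linear fit.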
R code
# Load the ggplot2 package
library(ggplot2)
# Generate sample data
set.seed(42)
X <- seq(0, 10, length.out = 100)
y <- 3 * X + rnorm(100) # Target with some noise
df <- data.frame(X = X, y = y)
# Fit the linear model
model <- lm(y ~ X, data = df)
# Summary of the model to get R^2
summary(model)
# Extract R^2
r_squared <- summary(model)$r.squared
cat("Coefficient of Determination (R^2) using lm: ", round(r_squared, 2), "\n")
# Plotting using ggplot2
ggplot(df, aes(x = X, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", color = "red", se = FALSE) +
  ggtitle("Linear Regression and R^2") +
  xlab("X") +
  ylab("y") +
  annotate("text", x = 1, y = max(df$y), label = paste("R^2 =", round(r_squared, 2)), hjust = 0)
In the output, the coefficient of determination is high, and the plot shows the fitted line following the data closely, which indicates a good fit.