









Simple linear regression is a statistical method used to model and analyze the relationship between two continuous variables. Specifically, it aims to predict the value of one variable (the dependent or response variable) based on the value of another variable (the independent or predictor variable). The relationship is assumed to be linear, meaning it can be described with a straight line.
In simple linear regression, there is only one independent variable. The relationship between $x$ and $y$ is modeled by a straight line:

$$y = \beta_0 + \beta_1 x + \epsilon$$

Where:
- $y$ is the dependent variable (the output).
- $x$ is the independent variable (the input).
- $\beta_0$ is the intercept (the value of $y$ when $x = 0$).
- $\beta_1$ is the slope of the line (the change in $y$ for a unit change in $x$).
- $\epsilon$ is the error term (the difference between the observed and predicted values of $y$).
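To make the roles of the intercept, slope, and error term concrete, here is a minimal Python sketch that generates a few observations from a straight-line model. The parameter values, the `x` values, and the choice of normally distributed noise are all assumptions made for illustration; none of them come from the article.

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
beta_0 = 2.0   # intercept: value of y when x = 0
beta_1 = 0.5   # slope: change in y for a unit change in x

# Assumed inputs and assumed normally distributed noise.
rng = np.random.default_rng(seed=0)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
epsilon = rng.normal(loc=0.0, scale=0.1, size=x.size)  # error term

y_line = beta_0 + beta_1 * x   # points lying exactly on the line
y = y_line + epsilon           # observed values scatter around the line

# The error term is the gap between each observed value and the line.
print(np.allclose(y - y_line, epsilon))
```

Each step of 1 in `x` moves the line up by `beta_1`, and the line passes through `beta_0` at `x = 0`; the noise is what keeps the observed points from sitting exactly on it.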
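The formulas to estimate the slope ($\beta_1$) and intercept ($\beta_0$) are:

$$\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

$$\beta_0 = \bar{y} - \beta_1 \bar{x}$$

Where:
- $\bar{x}$ is the mean of the $x$ values.
- $\bar{y}$ is the mean of the $y$ values.
- $x_i$ and $y_i$ are the individual data points.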
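The article does not say where these estimates come from. As a short sketch, they are the standard ordinary least squares solution: the slope and intercept that minimize the sum of squared vertical distances between the data points and the line. The symbol $S$ is introduced here only for this derivation, and $\bar{x}$, $\bar{y}$ denote the means of the $x$ and $y$ values.

$$S(\beta_0, \beta_1) = \sum (y_i - \beta_0 - \beta_1 x_i)^2$$

Setting the partial derivative with respect to $\beta_0$ to zero gives the intercept:

$$\frac{\partial S}{\partial \beta_0} = -2 \sum (y_i - \beta_0 - \beta_1 x_i) = 0 \quad\Rightarrow\quad \beta_0 = \bar{y} - \beta_1 \bar{x}$$

Setting the partial derivative with respect to $\beta_1$ to zero and substituting this expression for $\beta_0$ gives the slope:

$$\frac{\partial S}{\partial \beta_1} = -2 \sum x_i \left[ (y_i - \bar{y}) - \beta_1 (x_i - \bar{x}) \right] = 0 \quad\Rightarrow\quad \beta_1 = \frac{\sum x_i (y_i - \bar{y})}{\sum x_i (x_i - \bar{x})} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$

The last equality holds because $\sum (y_i - \bar{y}) = 0$ and $\sum (x_i - \bar{x}) = 0$, so subtracting $\bar{x}$ from each $x_i$ factor changes neither sum.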
Example: Sunlight & Selfies
Sunlight (hours) | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
Number of Selfies | 5 | 10 | 12 | 15 | 20 |
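Calculate the means:

$$\bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = 3, \qquad \bar{y} = \frac{5 + 10 + 12 + 15 + 20}{5} = 12.4$$

Calculate the slope ($\beta_1$):

$$\beta_1 = \frac{(-2)(-7.4) + (-1)(-2.4) + (0)(-0.4) + (1)(2.6) + (2)(7.6)}{(-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2} = \frac{35}{10} = 3.5$$

Calculate the intercept ($\beta_0$):

$$\beta_0 = 12.4 - 3.5 \times 3 = 1.9$$

The estimated linear regression model is:

$$y = 1.9 + 3.5x$$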
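Working the table through by hand gives a slope of 3.5 and an intercept of 1.9, so the fitted line is y = 1.9 + 3.5x. The short Python sketch below shows one way to use that line: it computes the fitted number of selfies for each observed sunlight value and the leftover residual. The helper name `predict_selfies` is hypothetical and exists only for this illustration.

```python
# Coefficients of the fitted line for the sunlight/selfies table.
beta_0 = 1.9   # intercept
beta_1 = 3.5   # slope


def predict_selfies(sunlight_hours):
    """Hypothetical helper: fitted number of selfies for a sunlight value."""
    return beta_0 + beta_1 * sunlight_hours


sunlight = [1, 2, 3, 4, 5]
observed = [5, 10, 12, 15, 20]

# Compare each observation with the value the line gives for it.
for hours, actual in zip(sunlight, observed):
    fitted = predict_selfies(hours)
    residual = actual - fitted
    print(f"{hours} h: fitted {fitted:.1f}, observed {actual}, residual {residual:+.1f}")
```

The residuals cancel out to zero in total, which is a property of a least squares line with an intercept. Calling `predict_selfies(6)` would give about 22.9, but that is an extrapolation beyond the observed range of 1 to 5 hours and should be read with caution.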
Python code:
```python
import numpy as np

# Given data
x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 10, 12, 15, 20])

# Calculate the means of x and y
mean_x = np.mean(x)
mean_y = np.mean(y)

# Calculate the slope (beta_1)
numerator = np.sum((x - mean_x) * (y - mean_y))
denominator = np.sum((x - mean_x) ** 2)
beta_1 = numerator / denominator

# Calculate the intercept (beta_0)
beta_0 = mean_y - beta_1 * mean_x

print(f"The estimated linear regression model is: y = {beta_0:.2f} + {beta_1:.2f}x")
```
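As a cross-check on the manual formulas, the same line can be fitted with NumPy's built-in `np.polyfit`. This is a minimal sketch rather than part of the original article; the variable names `slope` and `intercept` are chosen here for illustration.

```python
import numpy as np

# Same data as the sunlight/selfies example
x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 10, 12, 15, 20])

# np.polyfit with deg=1 fits a straight line by least squares and
# returns the coefficients highest power first: [slope, intercept].
slope, intercept = np.polyfit(x, y, deg=1)

print(f"polyfit: y = {intercept:.2f} + {slope:.2f}x")
```

The printed coefficients should agree with the values from the step-by-step calculation, up to floating-point rounding.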
R code:
```r
# Given data
x <- c(1, 2, 3, 4, 5)
y <- c(5, 10, 12, 15, 20)

# Calculate the means of x and y
mean_x <- mean(x)
mean_y <- mean(y)

# Calculate the slope (beta_1)
numerator <- sum((x - mean_x) * (y - mean_y))
denominator <- sum((x - mean_x)^2)
beta_1 <- numerator / denominator

# Calculate the intercept (beta_0)
beta_0 <- mean_y - beta_1 * mean_x

cat("The estimated linear regression model is: y =", beta_0, "+", beta_1, "x\n")
```