Motivation
Recall that LASSO adds an L1 penalty, the sum of the absolute values of the coefficients, to the least-squares loss, which can shrink some coefficients exactly to zero. Ridge regression applies the same idea with an L2 penalty, as described next.
Ridge Regression:
Ridge adds the penalty, which is the sum of the squares of the coefficients, to the loss function in linear regression. Ridge regression shrinks the coefficients but does not set any of them to zero, meaning it retains all features but reduces their impact. It is useful for dealing with multicollinearity (when features are highly correlated). Its objective function is:
$$\min_{\beta} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$
where $y_i$ are the actual values, $\hat{y}_i$ are the predicted values, $\beta_j$ are the coefficients, $n$ is the number of observations, $p$ is the number of features, and $\lambda$ is the regularization parameter.
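To make the penalty concrete, here is a small, self-contained R sketch (illustrative only, not part of the tutorial; the toy data and variable names are made up) that computes the ridge objective by hand:

# Illustrative only: compute the ridge objective for a toy dataset
set.seed(1)
X_toy <- matrix(rnorm(10 * 3), 10, 3)  # 10 observations, 3 features
y_toy <- rnorm(10)                     # toy response
b <- c(0.5, -0.2, 0.1)                 # candidate coefficients
lambda <- 0.7                          # regularization parameter
rss <- sum((y_toy - X_toy %*% b)^2)    # residual sum of squares
penalty <- lambda * sum(b^2)           # L2 penalty on the coefficients
rss + penalty                          # value of the ridge objective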
Choosing Between Lasso and Ridge (L1 versus L2):
- Lasso: Preferred when you expect only a few features to matter. By shrinking the coefficients of unimportant features exactly to zero, it performs feature selection, yielding simpler, more interpretable models that can reduce overfitting and generalize better to unseen data.
- Ridge: Preferred when you expect many features to contribute to the outcome, as is common in complex datasets with many correlated predictors. It handles multicollinearity well and keeps every feature in the model, shrinking all coefficients toward zero without eliminating any, which often gives more stable predictions (see the short comparison sketch after this list).
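As a quick illustration of this difference (a sketch with made-up data, not part of the tutorial's main code), fitting the same dataset with alpha = 1 (lasso) and alpha = 0 (ridge) in glmnet shows that only the lasso drives coefficients exactly to zero:

# Sketch: compare L1 (lasso) and L2 (ridge) penalties on the same data
library(glmnet)
set.seed(1)
X_demo <- matrix(rnorm(100 * 10), 100, 10)
y_demo <- X_demo[, 1] - 2 * X_demo[, 2] + rnorm(100)  # only 2 features matter
lasso_fit <- cv.glmnet(X_demo, y_demo, alpha = 1)     # L1 penalty
ridge_fit <- cv.glmnet(X_demo, y_demo, alpha = 0)     # L2 penalty
# Count coefficients shrunk exactly to zero (intercept excluded)
sum(as.vector(coef(lasso_fit, s = "lambda.min"))[-1] == 0)  # typically several
sum(as.vector(coef(ridge_fit, s = "lambda.min"))[-1] == 0)  # typically none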
Implementation:
The following code walks through these steps:
Step 1: Install and Load Packages
This installs and loads the required libraries for Ridge regression.
install.packages("glmnet")
install.packages("caret")
install.packages("e1071")
library(glmnet)
library(caret)
library(e1071)
Step 2: Generate Synthetic Data
Creates a synthetic dataset with 100 observations and 20 features.
set.seed(42)
n <- 100
p <- 20
X <- matrix(rnorm(n * p), n, p) # Generate random features
beta <- rnorm(p) # Random coefficients
y <- X %*% beta + rnorm(n) * 0.1 # Compute target variable with noise
Step 3: Split Data into Training and Test Sets
Divides the dataset into 80% training and 20% testing.
trainIndex <- createDataPartition(y, p = 0.8, list = FALSE)
X_train <- X[trainIndex, ]
X_test <- X[-trainIndex, ]
y_train <- y[trainIndex]
y_test <- y[-trainIndex]
Step 4: Train Ridge Regression Model
Applies Ridge regression (alpha = 0) with cross-validation.
alpha <- 0 # Ridge regression (L2 penalty)
ridge_model <- cv.glmnet(X_train, y_train, alpha = alpha)
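Because cv.glmnet stores the cross-validation results, an optional check (not part of the original steps) is to inspect which penalty was selected and how the error varies with lambda:

# Optional: inspect the cross-validated penalty
ridge_model$lambda.min  # lambda giving the minimum cross-validated error
ridge_model$lambda.1se  # largest lambda within one standard error of that minimum
plot(ridge_model)       # cross-validation error as a function of log(lambda)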
Step 5: Evaluate the Model
Generates predictions and computes performance metrics.
y_train_pred <- predict(ridge_model, X_train, s = "lambda.min")
y_test_pred <- predict(ridge_model, X_test, s = "lambda.min")
train_mse <- mean((y_train - y_train_pred)^2) # Mean Squared Error for training set
test_mse <- mean((y_test - y_test_pred)^2) # Mean Squared Error for test set
train_r2 <- 1 - sum((y_train - y_train_pred)^2) / sum((y_train - mean(y_train))^2) # R² for training set
test_r2 <- 1 - sum((y_test - y_test_pred)^2) / sum((y_test - mean(y_test))^2) # R² for test set
cat("Training set evaluation:\n")
cat("Mean Squared Error:", train_mse, "\n")
cat("R^2 Score:", train_r2, "\n")
cat("\nTest set evaluation:\n")
cat("Mean Squared Error:", test_mse, "\n")
cat("R^2 Score:", test_r2, "\n")
Step 6: Visualize Ridge Regression Coefficients
Plots the fitted coefficients (intercept excluded) to assess how strongly each feature contributes.
ridge_coefficients <- as.vector(coef(ridge_model, s = "lambda.min"))[-1] # drop the intercept
plot(ridge_coefficients, main = "Ridge Regression Coefficients",
     xlab = "Feature Index", ylab = "Coefficient Value",
     pch = 16, col = "blue")
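Because the data are synthetic, the true coefficients beta from Step 2 are known, so an optional sanity check is to overlay them on the same plot:

# Optional: overlay the true coefficients used to generate the data
points(beta, pch = 17, col = "red")
legend("topright", legend = c("Ridge estimate", "True coefficient"),
       pch = c(16, 17), col = c("blue", "red"))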
Complete code:
# Step 1: Install and load the necessary packages
install.packages("glmnet")
install.packages("caret")
install.packages("e1071")
library(glmnet)
library(caret)
library(e1071)
# Step 2: Load the dataset (using a synthetic dataset for this example)
set.seed(42)
n <- 100
p <- 20
X <- matrix(rnorm(n * p), n, p)
beta <- rnorm(p)
y <- X %*% beta + rnorm(n) * 0.1
# Step 3: Split the dataset into training and test sets
trainIndex <- createDataPartition(y, p = 0.8, list = FALSE)
X_train <- X[trainIndex, ]
X_test <- X[-trainIndex, ]
y_train <- y[trainIndex]
y_test <- y[-trainIndex]
# Step 4: Train the Ridge regression model
alpha <- 0 # Ridge regression (L2 penalty)
ridge_model <- cv.glmnet(X_train, y_train, alpha = alpha)
# Step 5: Evaluate the model
y_train_pred <- predict(ridge_model, X_train, s = "lambda.min")
y_test_pred <- predict(ridge_model, X_test, s = "lambda.min")
train_mse <- mean((y_train - y_train_pred)^2)
test_mse <- mean((y_test - y_test_pred)^2)
train_r2 <- 1 - sum((y_train - y_train_pred)^2) / sum((y_train - mean(y_train))^2)
test_r2 <- 1 - sum((y_test - y_test_pred)^2) / sum((y_test - mean(y_test))^2)
cat("Training set evaluation:\n")
cat("Mean Squared Error:", train_mse, "\n")
cat("R^2 Score:", train_r2, "\n")
cat("\nTest set evaluation:\n")
cat("Mean Squared Error:", test_mse, "\n")
cat("R^2 Score:", test_r2, "\n")
# Step 6: Visualize the results (if applicable, here we'll plot the coefficients)
ridge_coefficients <- as.vector(coef(ridge_model, s = "lambda.min"))[-1] # drop the intercept
plot(ridge_coefficients, main = "Ridge Regression Coefficients", xlab = "Feature Index", ylab = "Coefficient Value", pch = 16, col = "blue")
Output:
Training set evaluation:
Mean Squared Error: 0.04788766
R^2 Score: 0.9979284
Test set evaluation:
Mean Squared Error: 0.08317506
R^2 Score: 0.9952078
