Linear Discriminant Analysis (LDA) is a classifier that creates a linear decision boundary by fitting class-conditional densities to the data and applying Bayes’ rule. The model assumes that each class follows a Gaussian distribution and that all classes share the same covariance matrix. LDA can also reduce the dimensionality of the input data by projecting it onto the most discriminative directions; in scikit-learn this projection is exposed through the transform method.
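To make the Bayes' rule description concrete, here is a minimal NumPy sketch of the decision rule on the Iris data. It scores each class k with the linear function x'S⁻¹m_k − 0.5·m_k'S⁻¹m_k + log(p_k), where m_k is the class mean, S the shared covariance, and p_k the class prior, and then picks the highest-scoring class. The variable names are illustrative, and dividing the pooled scatter by n − K is an assumption about how the shared covariance is estimated, so treat the result as an approximation of what a library does internally rather than a reference implementation.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
classes = np.unique(y)
n_samples, n_features = X.shape

# Class priors and class means estimated from the data
priors = np.array([np.mean(y == k) for k in classes])
means = np.array([X[y == k].mean(axis=0) for k in classes])

# Shared covariance: pooled within-class scatter divided by n - K
# (assumption: the unbiased pooled estimate)
scatter = np.zeros((n_features, n_features))
for k, mu in zip(classes, means):
    centered = X[y == k] - mu
    scatter += centered.T @ centered
pooled_cov = scatter / (n_samples - len(classes))

# Linear discriminant score of every sample for every class
cov_inv_means = np.linalg.solve(pooled_cov, means.T)  # shape (n_features, K)
scores = (
    X @ cov_inv_means
    - 0.5 * np.sum(means.T * cov_inv_means, axis=0)
    + np.log(priors)
)
manual_pred = classes[np.argmax(scores, axis=1)]

# Compare with scikit-learn's LDA fitted on the same data
sk_pred = LinearDiscriminantAnalysis().fit(X, y).predict(X)
agreement = np.mean(manual_pred == sk_pred)
print(f"Agreement with scikit-learn: {agreement * 100:.2f}%")
```

Because every score is linear in x, the boundary between any two classes is a hyperplane, which is where the "linear" in LDA comes from.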
Implementation of Linear Discriminant Analysis in Python
Using the scikit-learn library in Python, you can perform LDA as follows:
- Import Libraries: Import necessary libraries for LDA, data handling, and evaluation.
- Load Dataset: Load the Iris dataset.
- Split Data: Split the dataset into training and testing sets.
- Create and Fit Model: Initialize the LDA model and fit it to the training data.
- Predict: Use the model to predict the labels for the test data.
- Evaluate: Calculate the accuracy of the model’s predictions.
Code:
# Import necessary libraries
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the dataset
data = load_iris()
X = data.data
y = data.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create the LDA model
lda = LDA()
# Fit the model to the training data
lda.fit(X_train, y_train)
# Predict the labels for the test set
y_pred = lda.predict(X_test)
# Calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy * 100:.2f}%')
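The introduction notes that LDA can also be used for dimensionality reduction. The sketch below shows one way to do that with scikit-learn, reusing the same split as above. The choice of n_components=2 and the names lda_2d, X_train_2d, and X_test_2d are illustrative assumptions; with three classes, LDA yields at most two discriminant directions.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Keep the two most discriminative directions (the maximum for 3 classes)
lda_2d = LinearDiscriminantAnalysis(n_components=2)

# Learn the projection on the training data only, then apply it to both sets
X_train_2d = lda_2d.fit_transform(X_train, y_train)
X_test_2d = lda_2d.transform(X_test)

print(X_train_2d.shape, X_test_2d.shape)
# Share of between-class variance captured by each direction
print(lda_2d.explained_variance_ratio_)
```

The projected arrays can be passed to a scatter plot or to another classifier. Fitting the projection on the training set alone keeps the test set unseen.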
Implementation of Linear Discriminant Analysis in R
Using the MASS library in R, you can perform LDA as follows:
- Load Libraries: Install (if needed) and load the MASS library, which provides the lda function.
- Load Dataset: Load the Iris dataset.
- Split Data: Split the dataset into training and testing sets.
- Create and Fit Model: Fit the LDA model to the training data.
- Predict: Use the model to predict the labels for the test data.
- Evaluate: Calculate the accuracy of the model’s predictions.
Code:
# Install the library only if it is missing, then load it
if (!requireNamespace("MASS", quietly = TRUE)) install.packages("MASS")
library(MASS)
# Load the dataset
data(iris)
# Split the dataset into training and testing sets
set.seed(42)
# round() keeps the sample size a whole number (70% of the rows)
train_indices <- sample(seq_len(nrow(iris)), size = round(0.7 * nrow(iris)))
train_data <- iris[train_indices, ]
test_data <- iris[-train_indices, ]
# Create the LDA model
lda_model <- lda(Species ~ ., data = train_data)
# Predict the labels for the test set
predictions <- predict(lda_model, test_data)
predicted_labels <- predictions$class
# Calculate the accuracy of the model
accuracy <- mean(predicted_labels == test_data$Species)
cat(sprintf("Accuracy: %.2f%%\n", accuracy * 100))