Custom distance function in KNN
To use KNeighborsClassifier with a custom distance function, pass the function directly to the metric parameter: scikit-learn accepts any callable that takes two 1-D arrays and returns a single distance value. Note that a Python callable is much slower than the built-in string metrics, since it is invoked for every pair of points. Here is example code:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
import numpy as np
# Define custom distance function
def custom_distance(x, y):
    return np.sqrt(np.sum((x - y) ** 2))  # Example: Euclidean distance
# Load dataset
data = load_iris()
X, y = data.data, data.target
# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create KNN classifier with custom distance metric
knn = KNeighborsClassifier(n_neighbors=5, weights='distance', metric=custom_distance)
# Fit the model
knn.fit(X_train, y_train)
# Make predictions
y_pred = knn.predict(X_test)
# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
Explanation:
- The metric parameter is set to the callable custom_distance, which scikit-learn calls on pairs of 1-D sample vectors to compute their distance.
- weights='distance' additionally weights each neighbor's vote by the inverse of its distance (covered in the next section).
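Any callable with this signature can be plugged in. Here is a minimal sketch of a weighted Euclidean metric; the feature_weights values are an illustrative assumption, chosen to emphasize the petal measurements in the iris data:
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
# Hypothetical per-feature weights (illustrative assumption):
# emphasize petal length/width over sepal length/width.
feature_weights = np.array([0.5, 0.5, 2.0, 2.0])
def weighted_euclidean(x, y):
    # Weighted Euclidean distance between two 1-D sample vectors
    return np.sqrt(np.sum(feature_weights * (x - y) ** 2))
knn = KNeighborsClassifier(n_neighbors=5, metric=weighted_euclidean)
# Fit and predict exactly as before:
# knn.fit(X_train, y_train); y_pred = knn.predict(X_test)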
Using weights in KNN (Weighted KNN)
Weighted K-Nearest Neighbors (Weighted KNN) is a powerful and intuitive variation of the classic K-Nearest Neighbors (KNN) algorithm used in classification and regression tasks. Unlike the standard KNN, which assigns equal importance to all neighbors, Weighted KNN introduces the idea of assigning weights to the neighbors based on their distances from the query point. This modification ensures that closer neighbors have a stronger influence on the outcome than those further away.
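To make the weighting concrete, here is a minimal sketch of an inverse-distance vote on made-up numbers (the distances and labels are illustrative assumptions; scikit-learn's weights='distance' applies the same 1/d scheme):
import numpy as np
# Hypothetical distances from a query point to its 3 nearest neighbors,
# and those neighbors' class labels (made-up values for illustration).
distances = np.array([0.5, 1.0, 2.0])
labels = np.array([0, 1, 1])
# Inverse-distance weights: closer neighbors count for more.
weights = 1.0 / distances  # -> [2.0, 1.0, 0.5]
# Weighted vote per class: class 0 gets 2.0, class 1 gets 1.0 + 0.5 = 1.5,
# so the single closest neighbor outvotes the two farther ones.
classes = np.unique(labels)
votes = np.array([weights[labels == c].sum() for c in classes])
print(classes[np.argmax(votes)])  # prints 0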
To implement Weighted KNN using scikit-learn, use the KNeighborsClassifier class and set the weights parameter to "distance", which weights each neighbor by the inverse of its distance. Here’s a basic example:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
# Load dataset
data = load_iris()
X, y = data.data, data.target
# Split dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Create Weighted KNN classifier
knn = KNeighborsClassifier(n_neighbors=5, weights='distance')
# Fit the model
knn.fit(X_train, y_train)
# Make predictions
y_pred = knn.predict(X_test)
# Evaluate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
In this example:
- weights='distance' ensures closer neighbors have more influence (a custom weighting function is also possible, as sketched below).
- You can customize the number of neighbors with n_neighbors.
- The accuracy_score helps evaluate the performance of the model.
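Beyond the built-in 'uniform' and 'distance' options, the weights parameter also accepts a callable that takes an array of neighbor distances and returns an array of weights of the same shape. As a minimal sketch (the Gaussian kernel and its bandwidth of 0.5 are illustrative assumptions, not library defaults):
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
def gaussian_weights(distances):
    # Gaussian kernel: weight decays smoothly with distance.
    # The 0.5 bandwidth is an arbitrary illustrative choice.
    return np.exp(-(distances ** 2) / (2 * 0.5 ** 2))
knn = KNeighborsClassifier(n_neighbors=5, weights=gaussian_weights)
# Used like any other classifier:
# knn.fit(X_train, y_train); y_pred = knn.predict(X_test)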
Customizing the KNN method
In this example, I will demonstrate how to create a simple custom K-Nearest Neighbors (KNN) classifier class that extends KNeighborsClassifier from sklearn.neighbors. The goal here is to illustrate how you can leverage object-oriented programming (OOP) principles like inheritance to customize and extend existing classes.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
class CustomKNN(KNeighborsClassifier):
    def __init__(self, n_neighbors=5, **kwargs):
        super().__init__(n_neighbors=n_neighbors, **kwargs)

    def fit(self, X, y):
        super().fit(X, y)
        # After fitting, print a summary
        print(f"CustomKNN fitted with {self.n_neighbors} neighbors.")
        # Calculate and print training accuracy
        train_predictions = self.predict(X)
        train_accuracy = accuracy_score(y, train_predictions)
        print(f"Training Accuracy: {train_accuracy:.2f}")
        return self
# Example usage
if __name__ == "__main__":
    # Load a dataset
    iris = load_iris()
    X = iris.data
    y = iris.target
    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    # Create and train the custom KNN model
    custom_knn = CustomKNN(n_neighbors=3)
    custom_knn.fit(X_train, y_train)
    # Make predictions on the test set
    y_pred = custom_knn.predict(X_test)
    # Calculate and print test accuracy
    test_accuracy = accuracy_score(y_test, y_pred)
    print(f"Test Accuracy: {test_accuracy:.2f}")
This example demonstrates several OOP concepts:
- Inheritance: The CustomKNN class inherits from KNeighborsClassifier, allowing it to reuse existing functionality.
- Method Overriding: The fit method is overridden to extend its functionality by adding a summary printout after training.
- Encapsulation: The internal workings of KNeighborsClassifier are hidden, and we only interact with the public methods and properties (fit, predict, n_neighbors).
- Abstraction: By creating a custom class, we abstract away the details of how KNN works and provide a simpler interface for users.
This approach allows you to easily extend and customize existing machine learning models in scikit-learn
while maintaining the flexibility and structure provided by object-oriented programming.
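Because CustomKNN still implements the standard scikit-learn estimator interface, it also works with the library's model-selection utilities. Here is a minimal sketch using cross_val_score (one caveat: since __init__ takes **kwargs, get_params only reports n_neighbors, so extra keyword arguments would not survive the cloning that cross-validation performs; this sketch only sets n_neighbors):
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
# Each fold's fit triggers the summary printout from the overridden fit method.
iris = load_iris()
scores = cross_val_score(CustomKNN(n_neighbors=3), iris.data, iris.target, cv=5)
print(f"Mean CV accuracy: {scores.mean():.2f}")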