K-Nearest Neighbors (KNN): an introduction


K-Nearest Neighbors (KNN) is a popular algorithm used for both classification and regression tasks. In classification, the output is a class membership, assigned by majority vote among the k nearest data points; in regression, the output is typically the average of the values of those k neighbors. One advantage of KNN is that it adapts easily to new training data: it stores the training set rather than fitting a model, so new samples can simply be added. Additionally, it doesn’t make any assumptions about the underlying data distribution, which can be beneficial when that distribution is unknown or irregular.

Example

Consider an example of using the K-Nearest Neighbors (KNN) algorithm with a small dataset with 5 samples and 2 input features. Let’s assume we are classifying the samples into two classes.

Sample   Feature 1   Feature 2   Class
A        1.0         1.0         0
B        2.0         2.0         0
C        3.0         3.0         1
D        4.0         4.0         1
E        5.0         5.0         1

Steps to Implement KNN:

  1. Choose the value of K: Let’s choose K = 3.
  2. Calculate the distance between the point to be classified and every point in the dataset. We’ll use Euclidean distance.
  3. Sort the distances and determine the nearest neighbors based on the chosen K value.
  4. Vote for the classes of the nearest neighbors.
  5. Assign the class with the highest vote to the point to be classified.
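The five steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function name `knn_classify`, its parameters, and the query point (1.5, 1.5) are assumptions made for this example, while the training data comes from the table above.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Step 2: Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: sort by distance and keep the k closest.
    # sorted() is stable, so tied distances keep their original order.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Steps 4 and 5: count the votes and return the most common class.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Training data from the table (samples A to E).
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (5.0, 5.0)]
labels = [0, 0, 1, 1, 1]

# Hypothetical query point: its 3 nearest neighbors are A, B, and C.
print(knn_classify(points, labels, (1.5, 1.5), k=3))  # prints 0
```

Step 1 (choosing K) appears here as the `k` argument. The sketch recomputes every distance on each call, which is fine for five samples but slow for large datasets.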

Example: Let’s classify a new point (x, y) = (3.5, 3.5) with K = 3.

Step 1: Calculate Distances

\text{Distance from (3.5, 3.5) to (1.0, 1.0)} = \sqrt{(3.5-1.0)^2 + (3.5-1.0)^2} = \sqrt{2.5^2 + 2.5^2} = \sqrt{6.25 + 6.25} = \sqrt{12.5} \approx 3.54

\text{Distance from (3.5, 3.5) to (2.0, 2.0)} = \sqrt{(3.5-2.0)^2 + (3.5-2.0)^2} = \sqrt{1.5^2 + 1.5^2} = \sqrt{2.25 + 2.25} = \sqrt{4.5} \approx 2.12

\text{Distance from (3.5, 3.5) to (3.0, 3.0)} = \sqrt{(3.5-3.0)^2 + (3.5-3.0)^2} = \sqrt{0.5^2 + 0.5^2} = \sqrt{0.25 + 0.25} = \sqrt{0.5} \approx 0.71

\text{Distance from (3.5, 3.5) to (4.0, 4.0)} = \sqrt{(3.5-4.0)^2 + (3.5-4.0)^2} = \sqrt{(-0.5)^2 + (-0.5)^2} = \sqrt{0.25 + 0.25} = \sqrt{0.5} \approx 0.71

\text{Distance from (3.5, 3.5) to (5.0, 5.0)} = \sqrt{(3.5-5.0)^2 + (3.5-5.0)^2} = \sqrt{(-1.5)^2 + (-1.5)^2} = \sqrt{2.25 + 2.25} = \sqrt{4.5} \approx 2.12
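The hand calculations above can be checked with a short script. This sketch assumes Python 3.8 or later, where `math.dist` is available; the names `samples`, `query`, and `distances` are chosen for illustration.

```python
import math

# Samples A to E from the table, keyed by name.
samples = {
    "A": (1.0, 1.0),
    "B": (2.0, 2.0),
    "C": (3.0, 3.0),
    "D": (4.0, 4.0),
    "E": (5.0, 5.0),
}
query = (3.5, 3.5)

# Euclidean distance from the query point to each sample.
distances = {name: math.dist(point, query) for name, point in samples.items()}

for name, d in distances.items():
    print(f"{name}: {d:.2f}")
# A: 3.54
# B: 2.12
# C: 0.71
# D: 0.71
# E: 2.12
```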

Step 2: Sort Distances:
C : 0.71
D : 0.71
B : 2.12
E : 2.12
A : 3.54

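Step 3: Determine Nearest Neighbors: C and D are the two closest points. B and E are tied at 2.12 for the third place; we break the tie by taking B, which comes first in the dataset. The 3 nearest neighbors to (3.5, 3.5) are therefore: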

  • C (0.71)
  • D (0.71)
  • B (2.12)

Step 4: Vote for Classes: The classes of the 3 nearest neighbors are:

  • C: Class 1
  • D: Class 1
  • B: Class 0

The votes are:

  • Class 0: 1 vote
  • Class 1: 2 votes

Step 5: Assign Class: The new point (3.5, 3.5) is assigned to Class 1 since it received the highest number of votes.
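The vote in Steps 4 and 5 can be sketched with `collections.Counter`. The snippet also checks what happens if E is chosen as the third neighbor instead of B, since both are at distance 2.12. The helper name `majority_class` is an assumption made for this example.

```python
from collections import Counter

def majority_class(neighbor_classes):
    """Return the class with the most votes among the given neighbors."""
    votes = Counter(neighbor_classes.values())
    return votes.most_common(1)[0][0]

# Neighbors chosen in Step 3, with their classes from the table.
with_b = {"C": 1, "D": 1, "B": 0}
# Alternative choice if the B/E tie were broken in favor of E.
with_e = {"C": 1, "D": 1, "E": 1}

print(dict(Counter(with_b.values())))  # {1: 2, 0: 1}
print(majority_class(with_b))          # 1
print(majority_class(with_e))          # 1
```

Either way of breaking the tie gives Class 1, so the prediction for (3.5, 3.5) does not depend on it in this example.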

