What is K-Nearest Neighbor (K-NN)?

A powerful Machine Learning Algorithm.

Manik Soni
4 min read · Sep 27, 2020

K-NN is a widely used machine learning technique for classification. It assigns a category to a new data point by majority vote among its nearest neighbors. K-NN can also be used for regression problems; the difference is that in regression it averages the values of the nearest neighbors instead of voting.
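To make the difference between voting and averaging concrete, here is a minimal sketch. The function names `knn_vote` and `knn_average` and the neighbor values are hypothetical, chosen only for illustration, and the neighbors are assumed to have been found already.

```python
from collections import Counter
from statistics import mean


def knn_vote(neighbor_labels):
    """Classification: return the most common label among the neighbors."""
    return Counter(neighbor_labels).most_common(1)[0][0]


def knn_average(neighbor_values):
    """Regression: return the mean of the neighbors' target values."""
    return mean(neighbor_values)


# Hypothetical labels and target values of the 5 nearest neighbors
print(knn_vote(["A", "B", "A", "A", "B"]))         # A
print(knn_average([10.0, 12.0, 11.0, 9.0, 13.0]))  # 11.0
```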

How does K-Nearest Neighbor (K-NN) work?

The plot above shows a classification problem with two categories, Category 1 and Category 2.

Now suppose we add a new data point for testing. The question is which category this new data point belongs to: Category 1 or Category 2.

To answer this, we follow a few steps that are key to understanding the K-NN algorithm.

Step 1: Choose the number K of neighbors, that is, how many neighbors around the new data point to consider for the classification. (Here we use K = 5.)

Step 2: Take the K nearest neighbors of the new data point, according to the Euclidean Distance.

Euclidean distance between two points (x1, y1) and (x2, y2): d = sqrt((x2 - x1)^2 + (y2 - y1)^2)

Step 3: Among these K neighbors, count the number of data points in each category. In our example, 3 of the 5 neighbors are in Category 1 and 2 are in Category 2.

Step 4: Assign the new data point to the category where you counted the most neighbors. Category 1 has more neighbors, so the new data point belongs to Category 1.
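The four steps can be sketched in plain Python as follows. This is an illustration, not an optimized implementation: the function name `knn_classify` and the toy points are hypothetical, and the points are picked so that K = 5 gives the same 3-versus-2 split as in the example.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, new_point, k=5):
    # Step 2: compute the Euclidean distance to every training point
    # and keep the k closest ones
    distances = sorted(
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # Step 3: count the neighbors in each category
    counts = Counter(nearest_labels)
    # Step 4: pick the category with the most neighbors
    return counts.most_common(1)[0][0], counts


# Hypothetical toy data: five points per category
points = [(2, 2), (2, 3), (3, 2), (0, 0), (1, 0),      # Category 1
          (4, 4), (4, 3.5), (6, 6), (7, 5), (6, 7)]    # Category 2
labels = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]

# Step 1: choose K = 5
category, counts = knn_classify(points, labels, new_point=(3, 3), k=5)
print(category)      # 1
print(dict(counts))  # {1: 3, 2: 2}
```

Sorting all distances is fine for a small data set. For larger ones, a library implementation such as scikit-learn's `KNeighborsClassifier` is usually the better choice.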

Now, we will do the implementation part. We first import our data set of people who want to buy a specific product.

Dataset

We follow these steps to build a K-NN model.

Step 1. Import the Libraries

Import Libraries

Step 2. Importing the Dataset

Import Dataset

Step 3. Split the data into a matrix of features (X) and the dependent variable (y). We use ‘Age’ and ‘Salary’ as the features for prediction.

Step 4. Split the matrix of features (X) and the dependent variable (y) into a training set and a test set.

Splitting of Test and Training set
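Steps 1 to 4 might look like the sketch below. The article's data file is not named here, so the code builds a small synthetic data set instead of loading one. The column name `Purchased`, the rule that generates the labels, and the 25% test size are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the article's data set (normally loaded with pd.read_csv)
rng = np.random.default_rng(0)
n = 400
age = rng.integers(18, 61, size=n)
salary = rng.integers(15_000, 150_001, size=n)
# Toy rule so that the label depends on both features
purchased = ((age > 40) | (salary > 90_000)).astype(int)
dataset = pd.DataFrame({"Age": age, "Salary": salary, "Purchased": purchased})

# Step 3: matrix of features (X) and dependent variable (y)
X = dataset[["Age", "Salary"]].values
y = dataset["Purchased"].values

# Step 4: hold out 25% of the rows as the test set (assumed ratio)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
print(X_train.shape, X_test.shape)  # (300, 2) (100, 2)
```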

Step 5. Apply feature scaling to the ‘Age’ and ‘Salary’ columns.

Code to apply feature scaling on the dataset
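Feature scaling matters for K-NN because salaries are in the tens of thousands while ages are in the tens, so an unscaled Euclidean distance would be dominated by ‘Salary’. Below is a minimal sketch using scikit-learn's `StandardScaler`. The choice of this scaler and the sample rows are assumptions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical [Age, Salary] rows
X_train = np.array(
    [[25, 30_000], [35, 60_000], [45, 90_000], [55, 120_000]], dtype=float
)
X_test = np.array([[30, 45_000], [50, 100_000]], dtype=float)

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)  # learn mean and std from the training set only
X_test_scaled = sc.transform(X_test)        # reuse the training statistics on the test set

print(X_train_scaled.mean(axis=0))  # close to 0 for both columns
print(X_train_scaled.std(axis=0))   # close to 1 for both columns
```

Fitting the scaler on the training set only, and then applying it to the test set, keeps information from the test set out of the model.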

Step 6. Fit the K-NN classifier to the training set.

Fitting the K-NN model

Step 7. Predict the test set results.

Predicting with the model
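Steps 6 and 7 could be sketched with scikit-learn's `KNeighborsClassifier` as shown below. The tiny, already scaled arrays are hypothetical. `n_neighbors=5` matches the K = 5 used earlier, and `metric="minkowski"` with `p=2` corresponds to the Euclidean distance.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical, already scaled training data: two well-separated groups
X_train = np.array([
    [-1.0, -1.0], [-1.2, -0.8], [-0.9, -1.1], [-1.1, -1.3], [-0.8, -0.9],
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1], [1.1, 1.3], [0.8, 0.9],
])
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
X_test = np.array([[-1.0, -0.9], [1.0, 1.2]])

# Step 6: fit on the training set only
classifier = KNeighborsClassifier(n_neighbors=5, metric="minkowski", p=2)
classifier.fit(X_train, y_train)

# Step 7: predict the categories of the test points
y_pred = classifier.predict(X_test)
print(y_pred)  # [0 1]
```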

Step 8. Make the confusion matrix to evaluate the predictions.

confusion matrix
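As an illustration of how the confusion matrix is read, the sketch below compares hypothetical true and predicted labels. The values are invented for the example and are not results from the article's data set.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and predictions for 10 test points
y_test = [0, 0, 0, 0, 1, 1, 1, 0, 1, 1]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0, 1, 1]

# Rows are the true categories, columns are the predicted categories
cm = confusion_matrix(y_test, y_pred)
print(cm)
# [[4 1]
#  [1 4]]

# Share of correct predictions: (4 + 4) / 10
print(accuracy_score(y_test, y_pred))  # 0.8
```

The diagonal holds the correct predictions. The off-diagonal entries are the mistakes, here one false positive and one false negative.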

Step 9. Visualization of the dataset.
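One common way to visualize a two-feature classifier is to color the plane by the predicted category and overlay the data points. The sketch below assumes this approach and uses synthetic two-class data from `make_blobs` as a stand-in for the scaled ‘Age’ and ‘Salary’ columns, so the axis labels are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Hypothetical two-class data standing in for scaled 'Age' and 'Salary'
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
X = StandardScaler().fit_transform(X)
classifier = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Predict the category at every point of a grid covering the data
xx, yy = np.meshgrid(
    np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.05),
    np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.05),
)
zz = classifier.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Shaded regions show the predicted category, dots show the data points
fig, ax = plt.subplots()
ax.contourf(xx, yy, zz, alpha=0.3)
ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
ax.set_title("K-NN (K = 5)")
ax.set_xlabel("Age (scaled)")
ax.set_ylabel("Salary (scaled)")
```

Calling `plt.show()` afterwards displays the figure, and `fig.savefig(...)` writes it to a file.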
