What Is Random Forest Classification?

from sklearn.ensemble import RandomForestClassifier: How Does Random Forest Classification Work?

Sep 30, 2020

Before we dive deep into Random Forest Classification, we first need to understand what a decision tree is and how that algorithm works.

Prerequisite: What is Decision Tree Classification?

Random forest classification is a non-linear machine learning model. Like decision tree classification, it makes predictions by splitting the data on feature values, but a decision tree classifier relies on a single tree. A random forest builds ‘N’ trees, each of which makes its own prediction, and combines their results, which usually gives more accurate predictions than any single tree.

Averaging helps to improve predictive accuracy and control over-fitting.

Random forest classification is an ensemble learning technique: the final prediction combines the results of all the trees, either by majority vote or by averaging their predicted class probabilities.
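As a rough illustration of the averaging idea, the sketch below fits scikit-learn's RandomForestClassifier and compares the forest's predicted probabilities with the plain average of its individual trees' probabilities. The data is synthetic (generated with make_classification) and the settings such as n_estimators=25 are assumptions chosen only for this example, not values from the article.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only for illustration (not the article's data set).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# 25 trees is an arbitrary choice for this sketch.
forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(X, y)

# Class probabilities from each individual tree:
# shape (n_trees, n_samples, n_classes).
per_tree_proba = np.stack([tree.predict_proba(X) for tree in forest.estimators_])

# Averaging over the trees reproduces the forest's own probabilities.
averaged_proba = per_tree_proba.mean(axis=0)
forest_proba = forest.predict_proba(X)

# The predicted class is the one with the highest averaged probability.
averaged_pred = forest.classes_[averaged_proba.argmax(axis=1)]
```

In scikit-learn, then, the "vote" is weighted by each tree's probability estimate, so averaging and voting describe the same combining step.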

Steps to follow Random Forest Classification:

Step 1. Pick at random ‘K’ data points from the Training set.

Step 2. Build the Decision Tree associated with these K data points.

Step 3. Choose the number of trees, ‘N tree’, that you want to build and repeat Steps 1 and 2 for each tree.

Step 4. For a new data point, have each of your ‘N tree’ trees predict the category Y for that point, and assign the point to the category that wins the majority vote.
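The four steps above can be sketched by hand as follows. This is a minimal illustration, not how scikit-learn implements its forest: the function names fit_forest and predict_forest are hypothetical, the K points are assumed to be drawn with replacement (a bootstrap sample), the class labels are assumed to be non-negative integers, and the data is synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, n_trees=10, k=None, seed=0):
    """Build n_trees decision trees, each on K randomly picked points."""
    rng = np.random.default_rng(seed)
    k = len(X) if k is None else k
    trees = []
    for _ in range(n_trees):  # Step 3: repeat Steps 1 and 2
        # Step 1: pick K data points at random (assumed: with replacement).
        idx = rng.integers(0, len(X), size=k)
        # Step 2: build the decision tree for these K points.
        tree = DecisionTreeClassifier(random_state=0)
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees


def predict_forest(trees, X):
    """Step 4: every tree predicts, and the majority vote wins."""
    # votes has shape (n_trees, n_samples); labels must be ints >= 0.
    votes = np.stack([tree.predict(X) for tree in trees]).astype(int)
    return np.array([np.bincount(column).argmax() for column in votes.T])


# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
trees = fit_forest(X, y, n_trees=15)
preds = predict_forest(trees, X)
```

Using an odd number of trees, as here, avoids tied votes when there are only two categories.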

Now we move on to the implementation. We first import our data set of people who want to buy a specific product.

We follow these steps to build a Random Forest Classification model:

Step 1. Import the Libraries

Step 2. Import the Dataset

Step 3. Split the data into a matrix of features (X) and the dependent variable (y). Here we use ‘Age’ and ‘Salary’ as the features for prediction.
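A minimal sketch of Steps 1 to 3 might look like the code below. The article's data file is not shown here, so a tiny hand-made DataFrame stands in for it: its values are invented for illustration, and the target column name ‘Purchased’ is an assumption. Only ‘Age’ and ‘Salary’ come from the text. With a real file, the DataFrame would typically be loaded with pd.read_csv instead.

```python
# Step 1. Import the libraries.
import pandas as pd

# Step 2. Import the dataset.
# Hypothetical stand-in for the article's data set: the values are invented
# and the 'Purchased' column name is an assumption made for this sketch.
dataset = pd.DataFrame({
    "Age": [19, 35, 26, 27, 47, 45, 32, 58],
    "Salary": [19000, 20000, 43000, 57000, 25000, 26000, 150000, 144000],
    "Purchased": [0, 0, 0, 0, 1, 1, 1, 1],
})

# Step 3. Split into the matrix of features (X) and the dependent variable (y).
X = dataset[["Age", "Salary"]].values
y = dataset["Purchased"].values
```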