What is the Random Forest Classification?

from sklearn.ensemble import RandomForestClassifier, How Random Forest Classification works?

Manik Soni
3 min readSep 30, 2020
Random Forest Classification

Before we dive deep into Random Forest Classification we first analyze what is a decision tree? and how the algorithm works?

Prerequisite: What is Decision Tree Classification?

Random forest Classification is a Non-linear machine learning model. Just like decision tree classification helps us to do predictions on a particular split, but in decision tree classification only one tree helps to do a prediction. In random forest ’N’ number of trees is doing prediction and the average of all the results of the tree helps to give accurate results.

Averaging helps to improve predictive accuracy and control over-fitting.

Random forest classification is a type of ensemble learning technique where we take an average of all the results.

Steps to follow Random Forest Classification:

Step1. Pick at random ‘K’ data points from the Training set.

Step 2. Build the Decision Tree associated with these K data points.

Step 3. Choose the number ‘N tree’ of trees you want to build and repeat Steps 1&2.

Step 4. For new data points, make each one of your ‘N tree’ trees predict the value of Y to for the data point in question and assign the new data point to the category that wins the majority vote.

Now, we will do the implementation part. We first import our data set of people who want to buy a specific product.

Dataset

We should follow the steps to build a Random Forest Classification Algorithm.

Step 1. Import the Libraries

Import Libraries

Step 2. Importing the Dataset

Import Dataset

Step 3. Split the data into a matrix of features(X)(So we are taking ‘Age’ and ‘Salary’ into consideration to do Prediction) and the dependent variable(y).

Step 4. Splitting the matrix of features(X) and dependent variable(y) into training and test set.

Splitting of Test and Training set

Steps 5. Now we do Feature Scaling for ‘Age’ and ‘Salary’ column.

Code to apply feature scaling on the dataset

Step 6. Fitting a linear model to test and training dataset.

Step 7. Predicting the Test result.

Predict the model

Step 8. Making the Confusion Matrix to do predictions.

confusion matrix

Step 8. Visualization of Dataset.

--

--

No responses yet