What is Random Forest Classification?
from sklearn.ensemble import RandomForestClassifier. How does Random Forest Classification work?
Before we dive deep into Random Forest Classification, we first need to understand what a decision tree is and how that algorithm works.
Prerequisite: What is Decision Tree Classification?
Random forest classification is a non-linear machine learning model. Like decision tree classification, it makes predictions by splitting the data on feature values, but a decision tree classifier relies on a single tree. A random forest builds ‘N’ trees, each of which makes its own prediction, and then combines them: for classification, the category predicted by the most trees wins. This usually gives more accurate results than any single tree.
Combining many trees in this way helps to improve predictive accuracy and control over-fitting.
Random forest classification is a type of ensemble learning technique, where the predictions of many models are combined into a single result.
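For class labels, combining the trees' outputs comes down to a majority vote. The short sketch below illustrates only that combining step, using made-up predictions from five hypothetical trees; the function name majority_vote and the example votes are assumptions for illustration, not part of scikit-learn. (scikit-learn's RandomForestClassifier combines trees slightly differently, by averaging their predicted class probabilities.)

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label that the largest number of trees predicted."""
    counts = Counter(tree_predictions)
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical predictions from five trees for one customer
# (1 = buys the product, 0 = does not buy).
votes = [1, 0, 1, 1, 0]
winner = majority_vote(votes)
print(winner)
```

Three of the five trees vote for category 1, so that is the category assigned to the customer.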
Steps of the Random Forest Classification algorithm:
Step 1. Pick ‘K’ data points at random from the training set.
Step 2. Build the decision tree associated with these K data points.
Step 3. Choose the number ‘N tree’ of trees you want to build and repeat Steps 1 and 2.
Step 4. For a new data point, make each one of your ‘N tree’ trees predict the category Y of that data point, and assign the new data point to the category that wins the majority vote.
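The four steps above can be sketched by hand with scikit-learn's DecisionTreeClassifier. This is a minimal illustration under stated assumptions, not how RandomForestClassifier is implemented internally: the data is synthetic (generated with make_classification), K is set to the size of the training set and sampled with replacement, and the names n_trees, k and forest_predict are chosen for this example. A real random forest also considers only a random subset of features at each split, which is left out here to stay close to the steps as listed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with two features (assumption: not the article's dataset).
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

rng = np.random.default_rng(0)
n_trees = 25   # 'N tree' in Step 3
k = len(X)     # 'K' in Step 1 (assumption: as many points as the training set)

trees = []
for _ in range(n_trees):
    # Step 1: pick K data points at random (here: with replacement).
    idx = rng.integers(0, len(X), size=k)
    # Step 2: build the decision tree associated with these K points.
    tree = DecisionTreeClassifier(random_state=0)
    tree.fit(X[idx], y[idx])
    # Step 3: repeat until n_trees trees have been built.
    trees.append(tree)


def forest_predict(trees, X_new):
    """Step 4: let every tree predict, then take the majority vote per point."""
    all_preds = np.stack([t.predict(X_new) for t in trees]).astype(int)
    # all_preds has shape (n_trees, n_points); vote column by column.
    return np.array([np.bincount(col).argmax() for col in all_preds.T])


y_pred = forest_predict(trees, X[:10])
print(y_pred)
```

Because every tree is trained on a different random sample, the trees disagree on some points, and the vote smooths out the mistakes of individual trees.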
Now for the implementation part. We first import our data set of potential buyers of a specific product.
We follow these steps to build a Random Forest Classification model.
Step 1. Import the libraries.
Step 2. Import the dataset.
Step 3. Split the data into a matrix of features (X) and the dependent variable (y). We take ‘Age’ and ‘Salary’ as the features used for prediction.
Step 4. Split the matrix of features (X) and the dependent variable (y) into a training set and a test set.
Step 5. Apply feature scaling to the ‘Age’ and ‘Salary’ columns.
Step 6. Fit the Random Forest classifier to the training set.
Step 7. Predict the test set results.
Step 8. Build the confusion matrix to evaluate the predictions.
Step 9. Visualize the dataset.
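The steps up to the confusion matrix can be sketched roughly as follows. Since the article's dataset is not reproduced here, this sketch generates a synthetic ‘Age’ and ‘Salary’ table with a made-up purchase rule; the value ranges, the labelling rule, and the hyperparameters (n_estimators=10, criterion="entropy", test_size=0.25) are all assumptions for illustration. The visualization step is omitted.

```python
# Step 1: import the libraries.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Steps 2-3: synthetic stand-in for the dataset (assumption, not real data).
rng = np.random.default_rng(0)
n = 400
age = rng.integers(18, 60, size=n)
salary = rng.integers(15_000, 150_000, size=n)
X = np.column_stack([age, salary]).astype(float)   # matrix of features
# Made-up rule: older or higher-paid people buy the product.
y = ((age > 40) | (salary > 90_000)).astype(int)   # dependent variable

# Step 4: split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Step 5: feature scaling, fitted on the training set only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Step 6: fit the Random Forest classifier to the training set.
classifier = RandomForestClassifier(
    n_estimators=10, criterion="entropy", random_state=0
)
classifier.fit(X_train, y_train)

# Step 7: predict the test set results.
y_pred = classifier.predict(X_test)

# Step 8: confusion matrix (rows = actual class, columns = predicted class).
cm = confusion_matrix(y_test, y_pred)
print(cm)
print(accuracy_score(y_test, y_pred))
```

The scaler is fitted on the training set and then reused on the test set, so no information from the test set leaks into training. Tree-based models do not strictly need feature scaling, but it is kept here to mirror Step 5.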