What is Decision Tree Classification?

from sklearn.tree import DecisionTreeClassifier, How does boost works with decision tree classification algorithms? , What is the decision tree classification in Python?

Manik Soni
4 min readSep 30, 2020

Decision Tree Classification is the most powerful and popular tool for doing the classification of a dataset. A decision tree can be learned by splitting the source set into subsets based on the attribute value.

For example, if we want to do the decision tree classification then consider a dataset:

Now, let us perform decision tree classification and our result will look like this:

So, performing the decision tree classification stepwise.

Step 1.Split the dataset having a value less than 60 and greater than 60based upon the information gain or we can say that based upon the entropy.

We are doing the visualization of the decision tree that How the machine will do the classification?

Step 2. Again split the dataset based on information gain or entropy. Split having values less than and greater than 50 but X2 having a value greater than 60.

Step 3. Split based on values having X1 less than 70 and greater than 70 but X2 less than 60.

Step 4. Split points having values less than 20 and greater than 20 but having X1 values greater than 70.

Now, we will do the implementation part. We first import our data set of people who want to buy a specific product.

Dataset

We should follow the steps to build a Naive Bayes Algorithm.

Step 1. Import the Libraries

Import Libraries

Step 2. Importing the Dataset

Import Dataset

Step 3. Split the data into a matrix of features(X)(So we are taking ‘Age’ and ‘Salary’ into consideration to do Prediction) and the dependent variable(y).

Step 4. Splitting the matrix of features(X) and dependent variable(y) into training and test set.

Splitting of Test and Training set

Steps 5. Now we do Feature Scaling for ‘Age’ and ‘Salary’ column.

Code to apply feature scaling on the dataset

Step 6. Fitting a linear model to test and training dataset.

Step 7. Predicting the Test result.

Predict the model

Step 8. Making the Confusion Matrix to do predictions.

confusion matrix

Step 8. Visualization of Dataset.

--

--