What is Logistic Regression?
sklearn.linear_model.LogisticRegression
Logistic Regression is a statistical model that uses a logistic function to model a binary dependent variable. Logistic Regression is basically used to solve the problems which are binary classified.
The Logistic Function, also known as a sigmoid function which helps to do the classification. The sigmoid function is basically an S-shape curve that takes any real value number and maps the value between 0 and 1.
So you might ask me a question that How this S curve is formed from linear regression? and Why is logistic regression considered a linear model?
So in, Simple Linear Regression we did the analysis that how we predict the salary of employees based on experience.
But, in the problems related to whether a person is buying a product or not? So in these types of classification problems you can, we cannot apply simple linear regression.
So if we analyze the above plot you can see that people after some age and before some age are not buying the product.
So we define a threshold that can visualize our plot in this way:
So if perform some mathematical substitution we can see that How to derive a sigmoid function formula?
So this mathematical substitution helps to do prediction which is based upon some prediction.
Now let's analyze how logistic regression classification is done?
So for that, we have to take 4 points 20,30, 40, and 50.
If we project our 4 points on the sigmoid function, then you can visualize it in this way :
Now if we put our values on this equation:
we can derive ‘p’ by putting the values of ‘x’ on the above equation.
Now you can see the value of ‘p’ which is basically the probability that our customer will buy that product or not.
Now, We do a simple comparison based on the probability that if the value of ‘p’ is less than 50% then he is not willing to buy that product otherwise he is willing to buy that product.
So, this is basically how sigmoid function works in Logistic Regression?.
Now we will do practically implementation of our model. We first import our data set of people who want to buy a specific product.
We should follow the steps to build a Logistic Regression model:
Step 1. Import the Libraries
Step 2. Importing the Dataset
Step 3. Split the data into a matrix of features(X)(So we are taking ‘Age’ and ‘Salary’ into consideration to do Prediction) and the dependent variable(y).
Step 4. Splitting the matrix of features(X) and dependent variable(y) into training and test set.
Steps 5. Now we do Feature Scaling for ‘Age’ and ‘Salary’ column.
Step 6. Fitting a linear model to test and training dataset.
Step 7. Predicting the Test result.
Step 8. Making the Confusion Matrix to do predictions.
Confusion Matrix helps to give the accuracy of our model that is the number of false values and true values.
Step 8. Visualization of Dataset.