Logistic regression can be binary or multinomial, depending on the number of levels of the response variable. Here we discuss binary logistic regression, but the same ideas apply to the multinomial case as well.
What is Binary Logistic Regression?
Binary logistic regression means the response variable takes one of two values (1 or 0). It is a special case of generalized linear models (GLMs), where:
- Random Component: refers to the probability distribution of the response variable (Y), which is binomial in the case of binary logistic regression.
- Systematic Component: specifies the explanatory variables (X1, X2, … Xk) in the model, which can be categorical or continuous in the case of logistic regression.
- Link Function: specifies the link between the random and systematic components. For logistic regression it is logit(), which we will discuss soon.
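To make the three components concrete, here is a minimal simulation sketch using only the Python standard library. The coefficient values b0, b1, b2, the sample size, and the seed are assumptions chosen for illustration; they do not come from any real data set.

```python
import math
import random

random.seed(0)  # fixed seed, chosen only so the sketch is reproducible

# Hypothetical coefficients (illustrative values, not from the text).
b0, b1, b2 = -0.5, 1.2, 0.8

def simulate_one():
    # Systematic component: explanatory variables combined linearly.
    x1 = random.gauss(0.0, 1.0)    # a continuous explanatory variable
    x2 = random.choice([0, 1])     # a categorical (0/1) explanatory variable
    eta = b0 + b1 * x1 + b2 * x2   # linear predictor

    # Link function: logit(p) = eta, i.e. p = 1 / (1 + exp(-eta)).
    p = 1.0 / (1.0 + math.exp(-eta))

    # Random component: Y is one binomial (Bernoulli) trial with probability p.
    y = 1 if random.random() < p else 0
    return x1, x2, p, y

sample = [simulate_one() for _ in range(1000)]
share_of_ones = sum(y for _, _, _, y in sample) / len(sample)
print(f"share of Y = 1 in the simulated sample: {share_of_ones:.3f}")
```

Each simulated row carries its own probability p, and the observed 0/1 response is a random draw around that probability. That is the sense in which the random and systematic components are separate.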
What are the scenarios where Logistic Regression is applicable?
- When we want to model the probability of a response variable as a function of some explanatory variables.
- When we want to predict the probability that an individual falls into each of the two categories of the binary response, and classify individuals accordingly, as a function of some explanatory variables.
- When the response variable is categorical with mutually exclusive categories (dichotomous in the binary case).
Some important terms in logistic regression:
- Probability – always lies between 0 and 1.
- Odds Ratio – the odds ratio for a variable in logistic regression is the factor by which the odds change for a one-unit increase in that variable, holding all other variables constant.
- Logit – the natural log of the odds, and the link function for logistic regression: logit(p) = ln(p / (1 - p))
Graph of logit(p):
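In the graph above, probabilities are plotted on the x-axis and the logit on the y-axis. The logit is 0 at p = 0.5, falls toward minus infinity as p approaches 0, and rises toward plus infinity as p approaches 1.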
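As a quick numeric illustration of how probability, odds, and logit relate, the sketch below evaluates all three for a few probabilities. The helper names odds and logit are chosen here for illustration.

```python
import math

def odds(p):
    """Odds of an event with probability p, for 0 < p < 1."""
    return p / (1.0 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

# Print one row per probability so the three scales can be compared.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p = {p:.2f}  odds = {odds(p):.3f}  logit = {logit(p):+.3f}")
```

At p = 0.5 the odds are 1 and the logit is 0. Probabilities that mirror each other around 0.5, such as 0.1 and 0.9, have logits of equal size and opposite sign.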
- Inverse Logit: In the logit graph, the probabilities (0 to 1) run along the x-axis, but we want them on the y-axis, just as the response variable sits on the y-axis in a linear regression equation. We get there by inverting the logit equation, that is, solving logit(p) = z for p: p = e^z / (1 + e^z) = 1 / (1 + e^(-z))
Here z is some number, and it will be a linear combination of the variables and their coefficients. The inverse logit function returns the probability of occurrence of an event. The graph for the inverse logit is:
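The inverse logit can be sketched in a few lines. The coefficients b0 and b1 below are hypothetical values, used only to show that any linear combination, however large or small, maps to a probability strictly between 0 and 1.

```python
import math

def logit(p):
    """Natural log of the odds, for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse of the logit: maps any real number z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept and slope (illustrative values only).
b0, b1 = -1.0, 0.5

for x in (-4, -2, 0, 2, 4, 6):
    z = b0 + b1 * x   # linear combination of the variable and coefficients
    p = inv_logit(z)  # probability of the event
    print(f"x = {x:+d}  z = {z:+.1f}  p = {p:.3f}  logit(p) = {logit(p):+.3f}")
```

The last column recovers z from p, which confirms that the two functions undo each other.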
Estimated Equation of Logistic Regression:
In logistic regression, we try to estimate the probability (p hat) or the odds of the response taking a particular value, based on the combination of values taken by one or more predictors.
The natural log of the odds is modeled as a linear function of the independent variables: ln(p hat / (1 - p hat)) = b0 + b1 X1 + … + bk Xk. Taking the antilog of the logit function gives our logistic regression equation: p hat = e^(b0 + b1 X1 + … + bk Xk) / (1 + e^(b0 + b1 X1 + … + bk Xk))
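The sketch below shows one way such an equation could be estimated. It uses plain gradient ascent on the log-likelihood with a single predictor. The simulated data, the "true" coefficients, the learning rate, and the iteration count are all assumptions chosen for illustration, and statistical packages typically use faster fitting routines than this one.

```python
import math
import random

random.seed(1)  # fixed seed, chosen only so the sketch is reproducible

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

# Simulate data from hypothetical "true" coefficients.
true_b0, true_b1 = -0.5, 1.5
xs = [random.gauss(0.0, 1.0) for _ in range(1000)]
ys = [1 if random.random() < inv_logit(true_b0 + true_b1 * x) else 0 for x in xs]

# Estimate b0 and b1 by gradient ascent on the average log-likelihood.
b0, b1 = 0.0, 0.0
learning_rate = 0.5
for _ in range(500):
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        err = y - inv_logit(b0 + b1 * x)  # observed minus predicted probability
        g0 += err                         # gradient with respect to b0
        g1 += err * x                     # gradient with respect to b1
    b0 += learning_rate * g0 / len(xs)
    b1 += learning_rate * g1 / len(xs)

odds_ratio = math.exp(b1)        # factor applied to the odds per unit of x
p_hat = inv_logit(b0 + b1 * 1.0) # estimated probability at x = 1

print(f"estimated b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"odds ratio for x = {odds_ratio:.3f}")
print(f"p hat at x = 1: {p_hat:.3f}")
```

Exponentiating the fitted slope gives the odds ratio described earlier, and plugging a predictor value into the fitted equation gives p hat for that value.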