ML: Logistic Regression
Logistic Regression
Logistic regression allows us to turn a linear combination of our input features into a probability using the logistic function:
\[
P(y = 1 \mid x) = g(\mathbf{w}^{\top} x) = \frac{1}{1 + e^{-\mathbf{w}^{\top} x}}
\]
Misnomer
Logistic regression is used to solve classification problems, not regression problems.
The logistic function \(g(z)=\frac{1}{1+e^{-z}}\) is frequently used to model binary outputs. Note that the output of the function is always between 0 and 1, as seen in the following figure:
Intuitively, the logistic function models the probability of a data point belonging to the class with label 1. This is because the output of the logistic function is bounded between 0 and 1, and we want our model to capture the probability that a data point has a specific label.
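As a concrete illustration, here is a minimal NumPy sketch showing how a linear combination of features is squashed into a probability by the logistic function; the weight and feature values are made-up numbers, not from the text:

```python
import numpy as np

def logistic(z):
    """Logistic function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical learned weights and a single data point's features.
w = np.array([0.8, -1.5, 0.3])   # one weight per feature
x = np.array([2.0, 1.0, -0.5])   # feature vector

z = w @ x                        # linear combination w^T x
p = logistic(z)                  # probability that the label is 1

print(f"w^T x = {z:.3f}, P(y = 1 | x) = {p:.3f}")  # p always lies in (0, 1)
```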
Multi-Class Logistic Regression
In multi-class logistic regression, we want to classify data points into \(K\) distinct categories. We use the softmax function in place of the logistic function, which models the probability of a new data point with features \(x\) having label \(i\) as follows:
\[
P(y = i \mid x) = \frac{e^{\mathbf{w}_i^{\top} x}}{\sum_{j=1}^{K} e^{\mathbf{w}_j^{\top} x}}
\]
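The following is a minimal NumPy sketch of the softmax computation; the weight matrix, feature vector, and number of classes are illustrative assumptions, not values from the text:

```python
import numpy as np

def softmax(scores):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)   # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical setup: K = 3 classes, 2 features.
W = np.array([[ 0.5, -0.2],
              [-0.3,  0.8],
              [ 0.1,  0.1]])            # one weight vector w_i per class
x = np.array([1.0, 2.0])

probs = softmax(W @ x)                  # P(y = i | x) for i = 1..K
print(probs, probs.sum())               # the probabilities sum to 1
```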
We estimate the parameters \(\mathbf{w}\) by maximizing the likelihood of the observed data.
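One common way to carry out this maximization numerically is gradient ascent on the log-likelihood (equivalently, gradient descent on the negative log-likelihood). The sketch below does this for the binary case; the synthetic data, learning rate, and iteration count are arbitrary choices for illustration, not part of the original material:

```python
import numpy as np

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic binary-labelled data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                        # 200 points, 2 features
true_w = np.array([1.5, -2.0])
y = (logistic(X @ true_w) > rng.uniform(size=200)).astype(float)

# Maximize the log-likelihood  sum_n [ y_n log p_n + (1 - y_n) log(1 - p_n) ]
# by gradient ascent; its gradient with respect to w is  X^T (y - p).
w = np.zeros(2)
learning_rate = 0.1
for _ in range(1000):
    p = logistic(X @ w)          # predicted P(y = 1 | x) for every point
    grad = X.T @ (y - p)         # gradient of the log-likelihood
    w += learning_rate * grad / len(y)

print("estimated w:", w)         # should land near the weights that generated the data
```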