Logistic Regression

A supervised learning model that applies the sigmoid function to a linear model, producing a probability estimate:

$$\hat{y} = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}}$$

where $w$ are weights, $b$ is a bias, and $\hat{y}$ is interpreted as $P(y = 1 \mid x)$.
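
As a minimal sketch of that forward pass in NumPy (the weights, bias, and inputs below are made-up illustrative values, not part of the original):

```python
import numpy as np

def sigmoid(z):
    # squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(X, w, b):
    # linear model followed by sigmoid: P(y = 1 | x) for each row of X
    return sigmoid(X @ w + b)

# illustrative values only
X = np.array([[0.5, 1.2], [-1.0, 0.3]])
w = np.array([0.8, -0.4])
b = 0.1
print(predict_proba(X, w, b))  # probabilities in (0, 1)
```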

Logistic regression is not a classification algorithm on its own - the output is a probability, a real value in $(0, 1)$. It becomes a classifier when combined with a decision rule, e.g., predict class 1 if $\hat{y} > 0.5$.
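
Continuing the sketch above, the classifier is just a threshold on that probability (the 0.5 cutoff is the usual default and can be tuned):

```python
def predict_class(X, w, b, threshold=0.5):
    # decision rule: class 1 if P(y = 1 | x) exceeds the threshold
    return (predict_proba(X, w, b) > threshold).astype(int)
```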

Why not just use linear regression for classification?

Linear regression outputs unbounded values - a point far from the decision boundary can receive a prediction well outside $[0, 1]$, which is meaningless as a class probability. The sigmoid squashes everything into $(0, 1)$, so the output can be read directly as $P(y = 1 \mid x)$, with $P(y = 0 \mid x) = 1 - P(y = 1 \mid x)$.

The key difference is the nonlinearity: logistic regression is essentially the simplest possible neural network - a linear model followed by one nonlinear activation function. Stack more layers with nonlinearities and you get a multi-layer perceptron. Sigmoid is the classic choice, but other functions work too.
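
A rough sketch of that view, assuming sigmoid activations throughout and placeholder layer sizes (not a prescribed architecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_regression(x, w, b):
    # one linear layer + one nonlinearity
    return sigmoid(w @ x + b)

def two_layer_mlp(x, W1, b1, w2, b2):
    # stack another linear layer + nonlinearity and it becomes an MLP
    h = sigmoid(W1 @ x + b1)      # hidden layer
    return sigmoid(w2 @ h + b2)   # output layer
```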

Optimization

Unlike linear regression, logistic regression has no closed-form solution. We can use gradient descent:

$$w \leftarrow w - \eta \, \nabla_w \mathcal{L}, \qquad b \leftarrow b - \eta \, \frac{\partial \mathcal{L}}{\partial b}$$

With cross-entropy loss, the problem is convex (single global minimum). MSE would make it non-convex: the sigmoid’s flat tails create near-zero gradients when predictions are confidently wrong, trapping gradient descent. Cross-entropy penalizes confident wrong predictions heavily, keeping gradients informative:

$$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right]$$
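
A minimal sketch of that training loop in NumPy, using the standard cross-entropy gradient $\nabla_w \mathcal{L} = \frac{1}{N} X^\top(\hat{y} - y)$ (learning rate and iteration count are placeholder choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_regression(X, y, lr=0.1, n_iters=1000):
    # X: (N, d) features, y: (N,) labels in {0, 1}
    N, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(n_iters):
        y_hat = sigmoid(X @ w + b)          # predicted P(y = 1 | x)
        grad_w = X.T @ (y_hat - y) / N      # gradient of cross-entropy w.r.t. w
        grad_b = np.mean(y_hat - y)         # gradient w.r.t. b
        w -= lr * grad_w                    # gradient descent update
        b -= lr * grad_b
    return w, b
```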

Multi-class extension

For $K$ classes, replace sigmoid with softmax:

$$P(y = k \mid x) = \frac{e^{w_k^\top x + b_k}}{\sum_{j=1}^{K} e^{w_j^\top x + b_j}}$$

Each class $k$ gets its own weight vector $w_k$ and bias $b_k$. Softmax ensures the probabilities sum to 1 across all classes.
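
A quick sketch of the softmax forward pass (NumPy, placeholder shapes), with the usual max-subtraction trick for numerical stability:

```python
import numpy as np

def softmax(z):
    # subtract the row max for numerical stability; each row then sums to 1
    z = z - np.max(z, axis=-1, keepdims=True)
    exp_z = np.exp(z)
    return exp_z / np.sum(exp_z, axis=-1, keepdims=True)

def predict_proba_multiclass(X, W, b):
    # W: (d, K) one weight vector per class, b: (K,) one bias per class
    return softmax(X @ W + b)   # (N, K), rows sum to 1
```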