Logistic Regression
Use the sigmoid (binary) or softmax (multi-class) to approximate the posterior probability:
\[h(t)=\frac{e^t}{1+e^t}=\frac{1}{1+e^{-t}}\]
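A minimal numerical sketch of the sigmoid (the function name and the stability trick are my additions; the identity between the two forms above is what makes it work):

```python
import numpy as np

def sigmoid(t):
    """h(t) = e^t/(1+e^t) = 1/(1+e^{-t}), evaluated without overflow."""
    # Use whichever of the two equivalent forms keeps the exponent <= 0,
    # so np.exp never overflows for large |t|.
    e = np.exp(-np.abs(t))
    return np.where(t >= 0, 1.0 / (1.0 + e), e / (1.0 + e))
```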
Binary classification
logistic function/sigmoid
LR model
- sigmoid: $\theta x \Rightarrow \text{probability}$
$P(y=1|x) = \frac{1}{1+e^{-\theta x}}$, $P(y=-1|x) = 1 - P(y=1|x) = \frac{1}{1+e^{\theta x}}$
$\Rightarrow$ with labels $y \in \{+1, -1\}$, the two cases merge neatly into $P(y|x) = \frac{1}{1+e^{-y\theta x}}$
- MLE
- likelihood = $\prod_{i=1}^n P(y_i|x_i)$
- Thought: why use MLE (a product of probabilities) instead of directly maximizing $\sum \theta x_i$? With a product, no single sample's probability can be sacrificed and made very small: the product is large only if every sample's predicted probability is reasonably large.
- take the log (and negate): $-\log \prod_{i=1}^n P(y_i|x_i) = \sum_{i=1}^n \log\left(1+e^{-y_i\theta x_i}\right)$
$\Rightarrow$
\[\min_\theta \frac{1}{n}\sum_{i=1}^n \log\left(1+e^{-y_i\theta x_i}\right)\]
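Putting the pieces together, a runnable sketch of binary LR fit by plain gradient descent (function names, learning rate, and the synthetic data are illustrative assumptions, not part of the original notes):

```python
import numpy as np
from scipy.special import expit   # numerically stable sigmoid

def neg_log_likelihood(theta, X, y):
    """(1/n) sum_i log(1 + exp(-y_i * theta.x_i)), labels y_i in {-1, +1}."""
    margins = y * (X @ theta)
    # log(1 + e^{-m}) computed as logaddexp(0, -m) for numerical stability
    return np.mean(np.logaddexp(0.0, -margins))

def grad(theta, X, y):
    """Gradient: -(1/n) sum_i y_i x_i * sigmoid(-y_i theta.x_i)."""
    margins = y * (X @ theta)
    s = expit(-margins)
    return -(X.T @ (y * s)) / len(y)

def fit(X, y, lr=0.1, steps=2000):
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        theta -= lr * grad(theta, X, y)
    return theta

# Usage on toy separable data: theta recovers the generating direction.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + 2 * X[:, 1] > 0, 1, -1)
theta = fit(X, y)
print(neg_log_likelihood(theta, X, y))
```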
Multi-class LR
softmax
- softmax turns scores into probabilities: $P(y=k|x) = \frac{e^{\theta_k x}}{\sum_j e^{\theta_j x}}$
- MLE: note that the indicator $I(y_i=k)$ appears as the exponent of $P$, i.e. likelihood $= \prod_{i=1}^n \prod_k P(y=k|x_i)^{I(y_i=k)}$
- take the log, which reduces to the cross-entropy loss (see the sketch below)
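A matching sketch for the multi-class case (labels assumed encoded as 0..K-1; all names are mine). The one-hot matrix Y below is exactly the indicator $I(y_i=k)$ used as the exponent of $P$ in the likelihood:

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax; subtracting each row's max avoids overflow."""
    Z = Z - Z.max(axis=1, keepdims=True)
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def neg_log_likelihood(Theta, X, y):
    """-(1/n) sum_i sum_k I(y_i=k) log P(y=k|x_i).

    Because I(y_i=k) is the exponent on P, taking logs leaves only the
    true class's log-probability for each sample.
    """
    P = softmax(X @ Theta)                 # (n, K) class probabilities
    return -np.mean(np.log(P[np.arange(len(y)), y]))

def grad(Theta, X, y):
    """Gradient: (1/n) X^T (P - Y), with Y one-hot, i.e. Y[i, k] = I(y_i=k)."""
    P = softmax(X @ Theta)
    Y = np.zeros_like(P)
    Y[np.arange(len(y)), y] = 1.0
    return X.T @ (P - Y) / len(y)
```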