[Original] Derivation and Solution of the Support Vector Machine (SVM)

(1) Derivation
The primal SVM problem is stated as follows. Given a linearly separable training set:
\[{\rm{S}} = \{ ({x_1},{y_1}),({x_2},{y_2}),...,({x_l},{y_l})\} \]
our goal is to find parameters \[({w^*},{b^*})\] that solve
 
\[\mathop {{\rm{minimise}}}\limits_{w,b} \quad \left\langle {w \cdot w} \right\rangle \]
\[{\rm{subject\ to}}\quad {y_i}\left( {\left\langle {w \cdot {x_i}} \right\rangle  + b} \right) \ge 1,\quad i = 1,2,...,l\]
 
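Correspondingly, after scaling the objective by \[\frac{1}{2}\] (which does not change the minimiser) and introducing multipliers \[{\alpha _i} \ge 0\], the Lagrangian is: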
 
\[L(w,b,\alpha ) = \frac{1}{2}\left\langle {w \cdot w} \right\rangle  - \sum\limits_{i = 1}^l {{\alpha _i}[{y_i}\left( {\left\langle {w \cdot {x_i}} \right\rangle  + b} \right) - 1]} \]
Taking the partial derivatives with respect to \[w\] and \[b\] and setting them to zero gives:
\[\frac{{\partial L(w,b,\alpha )}}{{\partial w}} = w - \sum\limits_{i = 1}^l {{y_i}{\alpha _i}{x_i}}  = 0\]
\[\frac{{\partial L(w,b,\alpha )}}{{\partial b}} =  - \sum\limits_{i = 1}^l {{y_i}{\alpha _i}}  = 0\]
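The two stationarity conditions can be sanity-checked numerically. The following is a minimal sketch, not part of the original derivation: the toy data `X`, `y`, the multipliers `alpha`, and the helper names are all hypothetical, with `alpha` chosen so that the weighted labels sum to zero. It confirms that the finite-difference gradient of L with respect to w vanishes at w = Σ yᵢαᵢxᵢ, and that L no longer depends on b once Σ yᵢαᵢ = 0.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def lagrangian(w, b, alpha, X, y):
    """L(w, b, alpha) = 1/2 <w, w> - sum_i alpha_i [y_i (<w, x_i> + b) - 1]."""
    penalty = sum(a * (yi * (dot(w, xi) + b) - 1.0)
                  for a, xi, yi in zip(alpha, X, y))
    return 0.5 * dot(w, w) - penalty

# Hypothetical toy data and multipliers, chosen so that sum_i y_i alpha_i = 0.
X = [(2.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
y = [1.0, 1.0, -1.0]
alpha = [0.5, 1.0, 1.5]

# Candidate stationary point in w: w = sum_i y_i alpha_i x_i.
w_star = [sum(a * yi * xi[k] for a, xi, yi in zip(alpha, X, y))
          for k in range(2)]

def grad_w_numeric(w, b, h=1e-6):
    """Central finite-difference gradient of L with respect to w."""
    grad = []
    for k in range(len(w)):
        up = list(w)
        up[k] += h
        down = list(w)
        down[k] -= h
        diff = lagrangian(up, b, alpha, X, y) - lagrangian(down, b, alpha, X, y)
        grad.append(diff / (2 * h))
    return grad

# The gradient in w should vanish at w_star (up to rounding error).
grad_at_star = grad_w_numeric(w_star, b=0.3)

# With sum_i y_i alpha_i = 0, changing b should leave L unchanged.
b_gap = lagrangian(w_star, 5.0, alpha, X, y) - lagrangian(w_star, -5.0, alpha, X, y)

print(w_star, grad_at_star, b_gap)
```

The value `b=0.3` is arbitrary; any other bias gives the same gradient in w, because b does not interact with w in the Lagrangian.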
Substituting these back into the Lagrangian (the term in \[b\] vanishes because \[\sum\limits_{i = 1}^l {{y_i}{\alpha _i}}  = 0\]):
\[\begin{array}{l}  L(w,b,\alpha ) = \frac{1}{2}\left\langle {w \cdot w} \right\rangle  - \sum\limits_{i = 1}^l {{\alpha _i}[{y_i}\left( {\left\langle {w \cdot {x_i}} \right\rangle  + b} \right) - 1]}  \\   \quad \quad \quad \quad  = \frac{1}{2}\sum\limits_{i,j = 1}^l {{y_i}{y_j}{\alpha _i}{\alpha _j}\left\langle {{x_i},{x_j}} \right\rangle }  - \sum\limits_{i,j = 1}^l {{y_i}{y_j}{\alpha _i}{\alpha _j}\left\langle {{x_i},{x_j}} \right\rangle }  + \sum\limits_{i = 1}^l {{\alpha _i}}  \\   \quad \quad \quad \quad  = \sum\limits_{i = 1}^l {{\alpha _i}}  - \frac{1}{2}\sum\limits_{i,j = 1}^l {{y_i}{y_j}{\alpha _i}{\alpha _j}\left\langle {{x_i},{x_j}} \right\rangle }  \\   \end{array}\]
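The substitution can be checked on concrete numbers. This is a small sketch under stated assumptions: the data `X`, `y` and the multipliers `alpha` are hypothetical, and `alpha` is arbitrary apart from satisfying Σ yᵢαᵢ = 0. The code evaluates the Lagrangian at w = Σ yᵢαᵢxᵢ and compares it with the final expression Σ αᵢ − ½ Σ yᵢyⱼαᵢαⱼ⟨xᵢ, xⱼ⟩.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy data; alpha only needs to satisfy sum_i y_i alpha_i = 0.
X = [(2.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
y = [1.0, 1.0, -1.0]
alpha = [0.5, 1.0, 1.5]
n = len(X)

# w expressed through the multipliers: w = sum_i y_i alpha_i x_i.
w = [sum(alpha[i] * y[i] * X[i][k] for i in range(n)) for k in range(2)]

def lagrangian(w, b):
    """L(w, b, alpha) = 1/2 <w, w> - sum_i alpha_i [y_i (<w, x_i> + b) - 1]."""
    penalty = sum(alpha[i] * (y[i] * (dot(w, X[i]) + b) - 1.0) for i in range(n))
    return 0.5 * dot(w, w) - penalty

def dual_objective(alpha):
    """sum_i alpha_i - 1/2 sum_{i,j} y_i y_j alpha_i alpha_j <x_i, x_j>."""
    quad = sum(y[i] * y[j] * alpha[i] * alpha[j] * dot(X[i], X[j])
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad

L_value = lagrangian(w, b=0.7)   # b is arbitrary here; its term cancels
W_value = dual_objective(alpha)

print(L_value, W_value)
```

Both values agree up to floating-point rounding, whatever value of `b` is passed in. Note that the final expression depends on the data only through the inner products ⟨xᵢ, xⱼ⟩.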
 
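By the KKT complementarity condition, the optimal \[{\alpha ^*}\] and \[({w^*},{b^*})\] must satisfy \[\alpha _i^*[{y_i}\left( {\left\langle {{w^*} \cdot {x_i}} \right\rangle  + {b^*}} \right) - 1] = 0\] for \[i = 1,2,...,l\]. This means that \[\alpha _i^*\] can be nonzero only for points lying on the supporting hyperplanes, where the constraint holds with equality; for every other point \[\alpha _i^*\] is zero. The points with nonzero \[\alpha _i^*\] are the support vectors, which is where the support vector machine gets its name, and this sparsity is also what makes a fast solution possible.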
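A tiny worked example makes the complementarity condition concrete. The sketch below uses a hypothetical three-point dataset whose optimum can be found by hand; the values `alpha = [0.25, 0.25, 0.0]` and `b = 0.0` are assumed here (hand-derived for this toy set, not taken from the original text) and the code only verifies that they satisfy the conditions above.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical toy data: two points on the margin, one far from it.
X = [(1.0, 1.0), (-1.0, -1.0), (3.0, 3.0)]
y = [1.0, -1.0, 1.0]
alpha = [0.25, 0.25, 0.0]  # assumed optimal multipliers for this toy set
b = 0.0                    # assumed optimal bias for this toy set
n = len(X)

# Stationarity in w: w = sum_i alpha_i y_i x_i.
w = tuple(sum(alpha[i] * y[i] * X[i][k] for i in range(n)) for k in range(2))

# Stationarity in b: sum_i alpha_i y_i should be zero.
balance = sum(alpha[i] * y[i] for i in range(n))

# Functional margins y_i (<w, x_i> + b); feasibility requires each to be >= 1.
margins = [y[i] * (dot(w, X[i]) + b) for i in range(n)]

# Complementarity: alpha_i * (margin_i - 1) should be zero for every i.
slack_products = [alpha[i] * (margins[i] - 1.0) for i in range(n)]

# Support vectors are the points with nonzero multiplier.
support = [i for i in range(n) if alpha[i] > 0]

# Primal and dual objective values coincide at the optimum.
primal = 0.5 * dot(w, w)
dual = sum(alpha) - 0.5 * sum(
    y[i] * y[j] * alpha[i] * alpha[j] * dot(X[i], X[j])
    for i in range(n) for j in range(n)
)

print(w, margins, support, primal, dual)
```

The third point has margin 3, strictly greater than 1, so complementarity forces its multiplier to zero; only the first two points, which sit exactly on the supporting hyperplanes, contribute to `w`.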
 
(2) Solution