AdaBoost
AdaBoost, short for Adaptive Boosting, was formulated by Schapire. It is a salient meta-learning method, and can be used in conjunction with a lot of other learning algorithms to improve their performances. AdaBoosting is adaptive in the sense that subsequent classifiers built are tweaking in favor of those instances misclassified by previous classifiers. It has been proven, in theory, that AdaBoosting is robust to noise data; in particular, it is not susceptible to the overfitting problem inherent to a wide variety of learning algorithms.
Quinlan's C5.0 implements, as commonly believed, a certain proprietary version of AdaBoosting together with decision tree induction algorithm.
Generalised form of the algorithm
Given: <math>(x_{1},y_{1}),\ldots,(x_{m},y_{m})<math> where <math>x_{i} \in X,\, y_{i} \in Y = \{-1, +1\}<math>
Initialise <math>D_{1}(i) = 1/m<math>.
For <math>t = 1,\ldots,T<math>:
- Train base learner using distribution <math>D_{t}<math>.
- Get base classifier <math>h_{t} : X \to \mathbf{R}<math>.
- Choose <math>\alpha_{t} \in \mathbf{R}<math>.
- Update:
<math>D_{t+1}(i) = \frac{D_{t}(i)\, exp(-\alpha_{t} y_{i}h_{t}(x_{i}))}{Z_{t}}<math>
where <math>Z_{t}<math> is a normalisation factor (chosen so that <math>D_{t+1}<math> will be a distribution).
Output the final classifier:
<math>H(x) = \textrm{sign}\left( \sum_{t=1}^{T} \alpha_{t}h_{t}(x)\right)<math>
Categories: Classification | Machine learning