Abstract
AdaBoost builds a final hypothesis by having a given weak learner generate one weak hypothesis in each training iteration. AdaBoost-based algorithms have been successfully applied to tasks such as natural language processing (NLP) and OCR. However, learning from training data with a large number of samples and features requires a long training time. We propose a fast AdaBoost-based algorithm for learning rules represented by combinations of features. Our algorithm constructs a final hypothesis by learning several weak hypotheses at each iteration. We assign a confidence-rated value to each weak hypothesis while ensuring a reduction in the theoretical upper bound of the training error of AdaBoost. We evaluate our method on English POS tagging and text chunking. The experimental results show that, on average, training with our algorithm is about 25 times faster than with an AdaBoost-based learner and about 50 times faster than with Support Vector Machines with a polynomial kernel, while maintaining state-of-the-art accuracy.
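The idea of learning several confidence-rated weak hypotheses per iteration can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the standard confidence-rated boosting formulation for rules that abstain when they do not match, a smoothed confidence value 0.5 * ln((W+ + eps) / (W- + eps)), and a simple top-k selection of rules per iteration. The names `train`, `predict`, `class_weights`, `rules_per_iter`, and `eps` are hypothetical. Each selected rule gets its confidence from the current sample weights, so every normalizer Z is at most 1 and the product of the Z values, which upper-bounds the training error, never increases.

```python
import math


def class_weights(rule, samples, labels, weights):
    """Sum the weights of positive and negative samples that contain the rule."""
    w_pos = sum(w for x, y, w in zip(samples, labels, weights) if rule <= x and y > 0)
    w_neg = sum(w for x, y, w in zip(samples, labels, weights) if rule <= x and y < 0)
    return w_pos, w_neg


def train(samples, labels, candidate_rules, iterations=5, rules_per_iter=2, eps=1e-6):
    """samples: list of feature sets; labels: +1 or -1; candidate_rules: list of frozensets.

    Returns the learned (rule, confidence) pairs and the product of the
    normalizers Z, which upper-bounds the training error.
    """
    n = len(samples)
    weights = [1.0 / n] * n
    model = []
    bound = 1.0
    for _ in range(iterations):
        # Score every candidate once per iteration (assumed selection criterion).
        scored = []
        for rule in candidate_rules:
            w_pos, w_neg = class_weights(rule, samples, labels, weights)
            scored.append((abs(math.sqrt(w_pos) - math.sqrt(w_neg)), rule))
        scored.sort(key=lambda pair: pair[0], reverse=True)

        # Add the top-scoring rules one by one, recomputing each confidence
        # from the current weights so that each step keeps Z <= 1.
        for _, rule in scored[:rules_per_iter]:
            w_pos, w_neg = class_weights(rule, samples, labels, weights)
            confidence = 0.5 * math.log((w_pos + eps) / (w_neg + eps))
            updated = [
                w * math.exp(-y * confidence) if rule <= x else w
                for x, y, w in zip(samples, labels, weights)
            ]
            z = sum(updated)
            weights = [w / z for w in updated]
            bound *= z
            model.append((rule, confidence))
    return model, bound


def predict(model, sample):
    """Sign of the summed confidences of all rules that match the sample."""
    score = sum(conf for rule, conf in model if rule <= sample)
    return 1 if score >= 0 else -1
```

Adding several rules per pass amortizes the cost of scoring all candidates, which is the expensive step when the number of samples and features is large. Recomputing each confidence from the current weights is the design choice that keeps the bound from increasing.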
Iwakura, T., Okamoto, S., & Asakawa, K. (2010). An AdaBoost using a weak-learner generating several weak-hypotheses for large training data of natural language processing. IEEJ Transactions on Electronics, Information and Systems, 130(1), 83–91. https://doi.org/10.1541/ieejeiss.130.83