Combining co-training with ensemble learning for application on single-view natural language datasets

Cited by 4 · In the libraries of 9 Mendeley readers

Abstract

In this paper we propose a novel semi-supervised learning algorithm, the Random Split Statistic algorithm (RSSalg), designed to exploit the advantages of the co-training algorithm while being exempt from co-training's requirement for an adequate feature split in the dataset. In our method, co-training is run a predefined number of times, with a different random split of the features in each run. Each run of co-training produces a different enlarged training set, consisting of the initial labeled data and the data labeled during the co-training process. Examples from the enlarged training sets are combined into a final training set and pruned so that only the most confidently labeled examples are kept. The final classifier in RSSalg is obtained by training the base learner on the set created this way. Pruning is performed by a genetic algorithm that retains only the most reliable and informative cases. Our experiments on 17 datasets with various characteristics show that RSSalg outperforms all considered alternative methods on the more redundant natural language datasets and is comparable to the considered alternative settings on datasets with less redundancy.
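The procedure described in the abstract — repeated co-training over random feature splits, pooling the resulting pseudo-labeled examples, and keeping only the most confidently labeled ones — can be illustrated with a minimal sketch. This is not the authors' implementation: the base learner here is a toy nearest-centroid classifier, the co-training step is reduced to a single labeling round per view, and a simple cross-run agreement threshold stands in for the paper's genetic-algorithm pruning. All names (`rssalg_sketch`, `agreement`, etc.) are illustrative assumptions.

```python
import random
from collections import Counter

def train_centroids(data, view):
    """Toy base learner: per-class mean over the selected feature view."""
    sums, counts = {}, {}
    for x, y in data:
        proj = [x[f] for f in view]
        if y not in sums:
            sums[y] = [0.0] * len(view)
            counts[y] = 0
        sums[y] = [a + b for a, b in zip(sums[y], proj)]
        counts[y] += 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(model, x, view):
    """Assign the class whose centroid is nearest in the feature view."""
    proj = [x[f] for f in view]
    dist = lambda y: sum((a - b) ** 2 for a, b in zip(proj, model[y]))
    return min(model, key=dist)

def rssalg_sketch(labeled, unlabeled, n_features, n_runs=5,
                  agreement=0.8, seed=0):
    """Illustrative sketch of the RSSalg idea (a simplification, not the
    published algorithm). Each run splits the features into two random
    views and labels the unlabeled pool from each view; examples whose
    label is consistent across at least `agreement` of all votes are kept
    -- a consensus stand-in for the paper's genetic-algorithm pruning."""
    rng = random.Random(seed)
    votes = {i: Counter() for i in range(len(unlabeled))}
    for _ in range(n_runs):
        feats = list(range(n_features))
        rng.shuffle(feats)                      # a different random split per run
        views = (feats[:n_features // 2], feats[n_features // 2:])
        for view in views:
            model = train_centroids(labeled, view)
            for i, x in enumerate(unlabeled):
                votes[i][predict(model, x, view)] += 1
    total = 2 * n_runs                          # two views vote in each run
    kept = []
    for i, x in enumerate(unlabeled):
        label, count = votes[i].most_common(1)[0]
        if count / total >= agreement:          # keep only confident labels
            kept.append((x, label))
    # The final training set: initial labeled data plus confident pseudo-labels.
    return labeled + kept
```

A quick usage example: with two labeled points near the all-zeros and all-ones corners and two nearby unlabeled points, both unlabeled points are labeled unanimously across runs and survive the consensus filter, so the returned training set contains all four examples.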

Citation (APA)

Slivka, J., Kovačević, A., & Konjović, Z. (2013). Combining co-training with ensemble learning for application on single-view natural language datasets. Acta Polytechnica Hungarica, 10(2), 133–152. https://doi.org/10.12700/aph.10.02.2013.2.10
