Coupled Bayesian sets algorithm for semi-supervised learning and information extraction

Saurabh Verma; Estevam R. Hruschka

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7524 LNAI(PART 2) 307-322

DOI: 10.1007/978-3-642-33486-3_20

13 Citations

19 Readers

Abstract

Our inspiration comes from NELL (Never-Ending Language Learning), a computer program running at Carnegie Mellon University that extracts structured information from unstructured web pages. We consider a semi-supervised learning approach to extracting category instances (e.g., country(USA), city(New York)) from web pages, starting with a handful of labeled training examples of each category or relation, plus hundreds of millions of unlabeled web documents. Semi-supervised approaches that use a small number of labeled examples together with many unlabeled examples are often unreliable, as they frequently produce an internally consistent but nevertheless incorrect set of extractions. We believe that this problem can be overcome by simultaneously learning independent classifiers for many different categories and relations, in the presence of an ontology defining constraints that couple the training of these classifiers, using a new approach named the Coupled Bayesian Sets algorithm, which is based on Bayesian Sets. Experimental results show that simultaneously learning a coupled collection of classifiers for 11 randomly chosen categories resulted in much more accurate extractions than training classifiers through the original Bayesian Sets algorithm, Naive Bayes, BaS-all, and Coupled Pattern Learner (the category extractor used in NELL). © 2012 Springer-Verlag.
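For readers unfamiliar with the Bayesian Sets algorithm on which the abstract builds, the sketch below may help. It shows only the standard Bayesian Sets score for binary features under a Beta-Bernoulli model (as published by Ghahramani and Heller), not the coupled variant proposed in this paper, whose details the abstract does not give. The function name `bayesian_sets_scores`, the prior strength `kappa = 2`, and the toy items and context-pattern features are all assumptions made for illustration.

```python
import math


def bayesian_sets_scores(data, seed_idx, kappa=2.0):
    """Score each row of a binary item-by-feature matrix against a seed set.

    Standard Bayesian Sets log-score for binary features; the coupling
    constraints described in the abstract are NOT modeled here.
    """
    n_items = len(data)
    n_feats = len(data[0])
    n_seeds = len(seed_idx)
    # Empirical feature means define the Beta prior (an assumed, common choice):
    # alpha_j = kappa * m_j, beta_j = kappa * (1 - m_j).
    means = [sum(row[j] for row in data) / n_items for j in range(n_feats)]

    const = 0.0
    weights = []
    for j, m in enumerate(means):
        if m <= 0.0 or m >= 1.0:
            # A feature that is constant across all items carries no signal.
            weights.append(0.0)
            continue
        alpha, beta = kappa * m, kappa * (1.0 - m)
        seed_count = sum(data[i][j] for i in seed_idx)
        alpha_t = alpha + seed_count             # posterior alpha
        beta_t = beta + n_seeds - seed_count     # posterior beta
        const += (math.log(alpha + beta) - math.log(alpha + beta + n_seeds)
                  + math.log(beta_t) - math.log(beta))
        weights.append(math.log(alpha_t) - math.log(alpha)
                       - math.log(beta_t) + math.log(beta))

    # The log-score is linear in the binary features of each item.
    return [const + sum(w * x for w, x in zip(weights, row)) for row in data]


# Hypothetical toy data: rows are noun phrases, columns are context patterns
# ("countries such as X", "X is a country", "city of X", "mayor of X").
items = ["USA", "Canada", "Brazil", "New York", "Paris"]
data = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
scores = bayesian_sets_scores(data, seed_idx=[0, 1])  # seeds: USA, Canada

# Rank all items by how well they fit the seed set.
for name, score in sorted(zip(items, scores), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

With the two country seeds, the unlabeled item that shares a country-like context pattern scores above the city-like items. A coupled version would additionally use ontology constraints across categories when deciding which candidates to accept, which this sketch does not attempt.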

Cite

Verma, S., & Hruschka, E. R. (2012). Coupled Bayesian sets algorithm for semi-supervised learning and information extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7524 LNAI, pp. 307–322). Springer Verlag. https://doi.org/10.1007/978-3-642-33486-3_20
