This paper shows how a combination of information retrieval, machine learning, and NLP corpus annotation techniques was applied to the problem of estimating the reliability of text content in Web documents. Our proposal is based on a model in which reliability is a similarity measure between the content of a document and a knowledge corpus. The proposal includes a new text representation that uses entailment-based graphs. These graph-based representations then serve as training instances for a machine learning algorithm, which builds a reliability model. Experimental results, including a comparison with a state-of-the-art method, illustrate the feasibility of our proposal. © 2012 Springer-Verlag.
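The idea of reliability as a similarity between a document and a knowledge corpus can be illustrated with a minimal sketch. This is not the authors' method: the abstract describes entailment-based graphs and a learned model, whereas the code below substitutes a plain bag-of-words cosine similarity as a stand-in. The function names (`term_counts`, `cosine`, `reliability_score`) and the choice of taking the maximum similarity over corpus entries are assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count its word tokens.

    Assumption: a bag-of-words vector stands in for the paper's
    entailment-based graph representation.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def reliability_score(document, knowledge_corpus):
    """Hypothetical score: highest similarity to any corpus entry.

    Returns 0.0 when the corpus is empty.
    """
    doc_vec = term_counts(document)
    return max(
        (cosine(doc_vec, term_counts(ref)) for ref in knowledge_corpus),
        default=0.0,
    )


if __name__ == "__main__":
    corpus = ["Water boils at 100 degrees Celsius at sea level."]
    print(reliability_score("Water boils at 100 degrees.", corpus))
    print(reliability_score("Cats enjoy chasing laser pointers.", corpus))
```

In this sketch a document that shares vocabulary with the corpus scores higher than one that shares none. A lexical overlap measure cannot tell a supported claim from a contradicted one that uses the same words, which is one reason a richer representation such as the entailment-based graphs described above would be needed in practice.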
CITATION STYLE
Sanz, L., Allende, H., & Mendoza, M. (2012). Text content reliability estimation in web documents: A new proposal. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7182 LNCS, pp. 438–449). https://doi.org/10.1007/978-3-642-28601-8_37