Annotated Search: Indexing, searching and ranking within annotated Wikipedia information boxes

by Simon Stenström
Baseline (2011)

Abstract

The main focus of Natural Language Processing has been on understanding texts better, but little work has been aimed at finding good search results for a query, given annotated data. This is the problem I have focused on. This thesis discusses how to index annotated data, in which cases a search engine over annotated data offers better search results than a regular full-text search engine, how the ranking function differs between annotated data search and unstructured data search, and how to evaluate an annotated search engine. I created a search engine over the semantically annotated Wikipedia information boxes and a baseline full-text search system over the same data. The thesis shows that, with some simple work, an annotated search engine can improve performance by between 17 and 27 percent compared to the baseline, even on a data collection as diverse as the Wikipedia information boxes.
