
Papers in this group

  1. With today’s public data sets containing billions of data items, more and more companies are looking to integrate external data with their traditional enterprise data to improve business intelligence analysis. These distributed data sources however…
  2. This paper investigates entity linking over millions of high-precision extractions from a corpus of 500 million Web documents, toward the goal of creating a useful knowledge base of general facts. This paper is the first to report on entity…
  3. We demonstrate a prototype system that showcases the power of using a knowledge base (Probase) for search. The goal of Probase is to enable common sense computing, and its foundation is a universal, probabilistic ontology that is more comprehensive…
  4. Classically, training relation extractors relies on high-quality, manually annotated training data, which can be expensive to obtain. To mitigate this cost, NLU researchers have considered two newly available sources of less expensive (but…
  5. This paper contains a brief literature review about the semantic web, the semantic web application Freebase and ontology visualizations. Furthermore, it describes the development and theoretical concepts of InfoSpace3D1, a web application which…
  6. Schema summarization on large-scale databases is a challenge. In a typical large database schema, a great proportion of the tables are closely connected through a few high degree tables. It is thus difficult to separate these tables into clusters…
  7. This paper introduces the problem of searching for social network accounts, e.g., Twitter accounts, with the rich information available on the Web, e.g., people names, attributes, and relationships to other people. For this purpose, we need to…
  8. We present a novel approach to relation extraction that integrates information across documents, performs global inference and requires no labelled text. In particular, we tackle relation extraction and entity identification jointly. We use distant…
  9. The use of semantics for various search tasks and the topic known as semantic search formed around it have attracted attention from researchers in many different communities and stirred up commercial expectations and interests. Because this topic is…
  10. Wikis are established means for collaborative authoring, versioning and publishing of textual articles. The Wikipedia for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Wikis are created by wiki software…
  11. The Automatic Content Linking Device is a just-in-time document retrieval system that monitors an ongoing dialogue or monologue and enriches it with potentially related documents from local repositories or from the Web. The documents are found using…
  12. This paper presents a novel ranking approach for complex semantic relationship (semantic association) search based on user preferences. We define a feature vector to describe various statistical and semantic features of a semantic association. Our…
  13. Comparing entities is an important part of decision making. Several approaches have been reported for mining comparable entities from Web sources to improve user experience in comparing entities online. However, these efforts extract only entities…
  14. Classifying blog posts by topics is useful for applications such as search and marketing. However, topic classification is time consuming and error prone, especially in an open domain such as the blogosphere. The state-of-the-art relies on…
  15. Novel research in the field of Linked Data focuses on the problem of entity summarization. This field addresses the problem of ranking features according to their importance for the task of identifying a particular entity. Next to a more human…
  16. We consider the problem of finding related tables in a large corpus of heterogenous tables. Detecting related tables provides users a powerful tool for enhancing their tables with additional data and enables effective reuse of available public data.…
  17. Freebase is a large-scale open-world database where users collaboratively create and structure content over an open platform. Keyword queries over Freebase are notoriously ambiguous due to the size and the complexity of the dataset. To this end,…
  18. Linked Data sources on the Web use a wide range of different vocabularies to represent data describing the same type of entity. For some types of entities, like people or bibliographic record, common vocabularies have emerged that are used by…