Finding “similar but different” documents based on coordinate relationship

Abstract

Traditional search technologies are based on similarity relationships: given a document, they return documents with similar content. However, similarity-based search does not always produce good results; for example, highly similar documents add little new information, so it is difficult to increase information gain. In this paper, we propose a method for finding documents that are similar to but different from a user-given document by distinguishing coordinate relationships from similarity relationships between documents. Put simply, a similar but different document is one that shares the topic of the given document but describes different events or concepts. For example, given a news article reporting the Oregon school shooting as input, articles reporting other school shootings, such as the Virginia Tech shooting, are detected and returned to the user. Experiments conducted on the New York Times Annotated Corpus verify the effectiveness of our method and illustrate the importance of incorporating coordinate relationships when finding similar but different documents.

Cite

Zhao, M., Ohshima, H., & Tanaka, K. (2016). Finding “similar but different” documents based on coordinate relationship. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10075 LNCS, pp. 110–123). Springer Verlag. https://doi.org/10.1007/978-3-319-49304-6_15
