Name discrimination and E-mail clustering using unsupervised clustering of similar contexts

Anagha Kulkarni; Ted Pedersen

Journal Article, Open Access

Journal of Intelligent Systems (2008) 17(1-3) 37-50

DOI: 10.1515/JISYS.2008.17.1-3.37

3 Citations

18 Readers

Abstract

In this paper, we apply an unsupervised word-sense discrimination technique based on clustering similar contexts (Purandare & Pedersen, 2004) to the problems of name discrimination and e-mail clustering. Names of people, places, and organizations are not always unique, which can create a problem when we refer to or seek out information about such entities. When this occurs in written text, we show that we can cluster the occurrences of an ambiguous name into distinct groups by identifying which contexts are similar to each other. Pedersen et al. (2005) previously showed that this approach can be used successfully to discriminate names with two-way ambiguity. Here we show that it can be extended to multi-way distinctions as well. Along similar lines, we also observe that e-mail messages can be treated as contexts, and that clustering them groups the messages by their underlying topic.
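
To make the idea of clustering similar contexts concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes simple bag-of-words context vectors, cosine similarity, and average-link agglomerative clustering, whereas the cited method may use different features and a different clustering algorithm. The name "John Smith", the example sentences, the stopword list, and the function names are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "was",
             "for", "on", "at", "by", "his", "her", "with"}

def context_vector(text, target):
    """Bag-of-words counts for one context, excluding stopwords and the
    ambiguous name itself (it appears in every context, so it carries
    no discriminating information)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    skip = STOPWORDS | set(target.lower().split())
    return Counter(t for t in tokens if t not in skip)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def cluster_contexts(texts, target, k):
    """Average-link agglomerative clustering of contexts into k groups.
    Returns a list of clusters, each a list of indices into `texts`."""
    vectors = [context_vector(t, target) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def avg_sim(a, b):
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Repeatedly merge the two most similar clusters until k remain.
    while len(clusters) > max(k, 1):
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        x, y = max(pairs, key=lambda p: avg_sim(clusters[p[0]], clusters[p[1]]))
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

# Hypothetical contexts mentioning one name that refers to two people.
contexts = [
    "John Smith directed the film and the film won an award",
    "John Smith studied memory in his psychology laboratory",
    "The award for best film went to director John Smith",
    "A psychology paper on memory by John Smith",
]

print(cluster_contexts(contexts, "John Smith", 2))  # [[0, 2], [1, 3]]
```

In this toy example the two film-related contexts end up in one cluster and the two psychology-related contexts in the other. Under the same assumptions, e-mail messages could be passed in as the contexts, with an empty target string, to group them by shared vocabulary.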

Cite

Kulkarni, A., & Pedersen, T. (2008). Name discrimination and E-mail clustering using unsupervised clustering of similar contexts. Journal of Intelligent Systems, 17(1–3), 37–50. https://doi.org/10.1515/JISYS.2008.17.1-3.37
