An NLP & IR Approach to Topic Detection

  • Chen H
  • Ku L
N/ACitations
Citations of this article
16Readers
Mendeley users who have this article in their library.
Get full text

Abstract

This paper presents algorithms for Chinese and English-Chinese topic detection. Named entities, other nouns and verbs are cue patterns to relate news stories describing the same event. Lexical translation and name transliteration resolve lexical differences between English and Chinese. A two-threshold scheme determines relevance (irrelevance) between a news story and a topic cluster. Lookahead information deals with ambiguous cases in clustering. The least-recently-used removal strategy models the time factor in such a way that older and unimportant terms will have no effect on clustering. Experimental results show that nouns and verbs as well as the least-recently-used removal strategy outperform other models. The performance of the named-entity-only approach decreases slightly, but it has no overhead of nouns-and-verbs approach with the least-recently-used removal strategy.

Cite

CITATION STYLE

APA

Chen, H.-H., & Ku, L.-W. (2002). An NLP & IR Approach to Topic Detection (pp. 243–264). https://doi.org/10.1007/978-1-4615-0933-2_12

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free