Topic detection in read documents

Rui Amaral; Isabel Trancoso

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2000) 1923 315-318

DOI: 10.1007/3-540-45268-0_29

Abstract

This paper addresses the problem of topic annotation in the speech retrieval domain. It describes an algorithm developed to perform automatic topic annotation of broadcast news (BN) speech corpora. The adopted approach is based on Hidden Markov Models (HMMs) and topic language models, and solves the topic segmentation and labelling tasks simultaneously. To overcome the lack of topic-labelled material for training statistical models, a two-stage unsupervised clustering procedure was developed. Both stages are based on the nearest-neighbour search method, using the Kullback-Leibler distance. Ongoing experiments to evaluate the system performance are also described.
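
As a rough illustration of the nearest-neighbour search with a Kullback-Leibler distance mentioned in the abstract, the sketch below compares smoothed unigram language models of a few toy stories. It is not the authors' implementation: the symmetrised form of the divergence, the additive smoothing, the toy stories, and the helper names (`unigram`, `sym_kl`, `nearest_neighbour`) are all assumptions made for this example, and the two-stage clustering itself is not reproduced.

```python
import math
from collections import Counter


def unigram(text, vocab, alpha=1.0):
    """Additively smoothed unigram model over a fixed vocabulary (assumed smoothing)."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl(p, q):
    """Kullback-Leibler divergence D(p || q); smoothing keeps every q[w] > 0."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def sym_kl(p, q):
    """Symmetrised divergence, used here as the 'distance' (an assumption)."""
    return kl(p, q) + kl(q, p)


def nearest_neighbour(index, models):
    """Index of the model closest to models[index], excluding itself."""
    others = (j for j in range(len(models)) if j != index)
    return min(others, key=lambda j: sym_kl(models[index], models[j]))


# Hypothetical toy stories standing in for broadcast news transcripts.
stories = [
    "the government passed the budget vote",
    "the team won the football match",
    "parliament will vote on the budget",
    "the football match ended in a draw",
]
vocab = sorted({w for s in stories for w in s.lower().split()})
models = [unigram(s, vocab) for s in stories]
neighbours = [nearest_neighbour(i, models) for i in range(len(models))]
print(neighbours)
```

On these toy inputs the two budget stories select each other as nearest neighbours, as do the two football stories. A clustering procedure could merge such pairs and re-estimate a language model for each cluster.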

Cite

Amaral, R., & Trancoso, I. (2000). Topic detection in read documents. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1923, pp. 315–318). Springer Verlag. https://doi.org/10.1007/3-540-45268-0_29
