A scalable and efficient probabilistic information retrieval and text mining system

Magnus Stensmo

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2002) 2415 LNCS 643-648

DOI: 10.1007/3-540-46084-5_105

Abstract

A system for probabilistic information retrieval and text mining that is both scalable and efficient is presented. Separate feature extraction or stop-word lists are not needed since the system can remove unneeded parameters dynamically based on a local mutual information measure. This is shown to be as effective as using a global measure. A novel way of storing system parameters eliminates the need for a ranking step during information retrieval from queries. Probability models over word contexts provide a method to suggest related words that can be added to a query. Test results are presented on a categorization task and screen shots from a live system are shown to demonstrate its capabilities. © Springer-Verlag Berlin Heidelberg 2002.
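
The abstract does not define the local mutual information measure, so the following is only an illustrative sketch and not the paper's method. It assumes that "local" means the single word-category term p(w,c) log(p(w,c) / (p(w) p(c))) of the mutual information sum, and that a parameter is dropped when this term does not exceed a threshold. The function names `local_mi` and `prune`, the toy documents, and the threshold value are all hypothetical.

```python
import math
from collections import Counter


def local_mi(docs):
    """Score each (word, category) pair by its own term of the mutual
    information sum: p(w,c) * log(p(w,c) / (p(w) * p(c))).

    docs is a list of (category, list_of_words) tuples.
    This definition of "local" is an assumption, not taken from the paper.
    """
    pair_counts = Counter()
    word_counts = Counter()
    cat_counts = Counter()
    total = 0
    for category, words in docs:
        for w in words:
            pair_counts[(w, category)] += 1
            word_counts[w] += 1
            cat_counts[category] += 1
            total += 1

    scores = {}
    for (w, c), n in pair_counts.items():
        p_wc = n / total
        p_w = word_counts[w] / total
        p_c = cat_counts[c] / total
        scores[(w, c)] = p_wc * math.log(p_wc / (p_w * p_c))
    return scores


def prune(scores, threshold):
    """Keep only the pairs whose local score exceeds the threshold."""
    return {pair: s for pair, s in scores.items() if s > threshold}


# Toy corpus (hypothetical): "the" occurs equally often in both categories.
docs = [
    ("sports", ["the", "team", "won", "the", "match"]),
    ("finance", ["the", "bank", "raised", "the", "rate"]),
]
scores = local_mi(docs)
kept = prune(scores, threshold=1e-9)  # hypothetical threshold
```

In this toy corpus, "the" is independent of the category, so its local score is zero and it is pruned without any stop-word list, while category-specific words such as "team" and "bank" are kept.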

Citation (APA)

Stensmo, M. (2002). A scalable and efficient probabilistic information retrieval and text mining system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2415 LNCS, pp. 643–648). Springer Verlag. https://doi.org/10.1007/3-540-46084-5_105
