Pyramidal digest: An efficient model for abstracting text databases

Wesley T. Chuang; D. Stott Parker

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001), vol. 2113, pp. 360-369

DOI: 10.1007/3-540-44759-8_36

Abstract

We present a novel model of automated composite text digest, the Pyramidal Digest. The model integrates traditional text summarization and text classification in that the digest not only serves as a "summary" but is also able to classify text segments of any given size, and answer queries relative to a context. "Pyramidal" refers to the fact that the digest is created in at least three dimensions: scope, granularity, and scale. The Pyramidal Digest is defined recursively as a structure of extracted and abstracted features that are obtained gradually (from specific to general, and from large to small text segment size) through a combination of shallow parsing and machine learning algorithms. There are three noticeable threads of learning taking place: learning of characteristic relations, rhetorical relations, and lexical relations. Our model provides a principle for efficiently digesting large quantities of text: progressive learning can digest text by abstracting its significant features. This approach scales, with complexity bounded by O(n log n), where n is the size of the text. It offers a standard and systematic way of collecting as many semantic features as possible that are reachable by shallow parsing. It enables readers to query beyond keyword matches.
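The recursive, large-to-small structure and the O(n log n) bound can be illustrated with a small sketch. This is not the authors' algorithm: the abstract gives no implementation details, so the halving strategy, the `DigestNode` type, the `build_digest` and `top_terms` functions, and the use of most-frequent tokens as "features" are all assumptions chosen for illustration. The sketch only shows why a recursive digest that does linear work per level over roughly log n levels stays within O(n log n).

```python
import heapq
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DigestNode:
    """One cell of the hypothetical pyramid: a token span plus its features."""
    start: int
    end: int
    features: list
    children: list = field(default_factory=list)


def top_terms(tokens, k=3):
    """Stand-in feature extractor (assumption): the k most frequent tokens.

    heapq.nsmallest with a fixed k runs in time linear in the number of
    distinct tokens, so each segment costs O(segment length).
    """
    counts = Counter(tokens)
    best = heapq.nsmallest(k, counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in best]


def build_digest(tokens, start=0, end=None, min_size=4):
    """Digest a span, then recurse on its two halves (large to small)."""
    if end is None:
        end = len(tokens)
    node = DigestNode(start, end, top_terms(tokens[start:end]))
    if end - start > min_size:
        mid = (start + end) // 2
        node.children = [
            build_digest(tokens, start, mid, min_size),
            build_digest(tokens, mid, end, min_size),
        ]
    return node


def depth(node):
    """Number of levels in the pyramid rooted at node."""
    return 1 + max((depth(child) for child in node.children), default=0)


tokens = "a b a c d d d e".split()
root = build_digest(tokens, min_size=2)
```

Each level of this tree touches every token once, and halving gives about log2(n) levels, which is where the n log n total comes from under these assumptions. The paper's learned characteristic, rhetorical, and lexical relations would replace the toy `top_terms` step.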

Cite

Chuang, W. T., & Parker, D. S. (2001). Pyramidal digest: An efficient model for abstracting text databases. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2113, pp. 360–369). Springer Verlag. https://doi.org/10.1007/3-540-44759-8_36
