Integrating geometrical and linguistic analysis for email signature block parsing

Abstract

The signature block is a common structured component found in email messages. Accurate identification and analysis of signature blocks is important in many multimedia messaging and information retrieval applications such as email text-to-speech rendering, automatic construction of personal address databases, and interactive message retrieval. It is also a very challenging task, because signature blocks often appear in complex two-dimensional layouts which are guided only by loose conventions. Traditional text analysis methods designed to deal with sequential text cannot handle two-dimensional structures, while the highly unconstrained nature of signature blocks makes the application of two-dimensional grammars very difficult. In this article, we describe an algorithm for signature block analysis which combines two-dimensional structural segmentation with one-dimensional grammatical constraints. The information obtained from both layout and linguistic analysis is integrated in the form of weighted finite-state transducers. The algorithm is currently implemented as a component in a preprocessing system for email text-to-speech rendering.
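As a rough illustration of the idea in the abstract, the sketch below first splits a signature block into columns wherever every line has whitespace (a stand-in for the two-dimensional structural segmentation), then labels each resulting field with a lowest-cost pattern match (a stand-in for the one-dimensional linguistic constraints). It is not the authors' algorithm: the article integrates both sources of evidence as weighted finite-state transducers, whereas this toy uses plain regular expressions with made-up costs. All names (`column_spans`, `label_field`, `parse_signature`, `PATTERNS`) and all cost values are hypothetical.

```python
import re

# Hypothetical field patterns with illustrative costs (lower is better).
# These are assumptions for the sketch, not values from the article.
PATTERNS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), 0.1),
    ("phone", re.compile(r"\+?\d[\d\s().-]{6,}\d"), 0.2),
]
DEFAULT_LABEL = ("text", 1.0)


def column_spans(lines, min_gap=2):
    """Geometric step: return (start, end) character spans of columns,
    separated by at least `min_gap` positions that are blank in every line."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    blank = [all(row[i] == " " for row in padded) for i in range(width)]

    spans, start, gap = [], None, 0
    for i, is_blank in enumerate(blank):
        if not is_blank:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        spans.append((start, width - gap))
    return spans


def label_field(field):
    """Linguistic step: pick the matching pattern with the lowest cost."""
    best = DEFAULT_LABEL
    for name, pattern, cost in PATTERNS:
        if pattern.search(field) and cost < best[1]:
            best = (name, cost)
    return best[0]


def parse_signature(lines):
    """Read fields column by column, top to bottom, and label each one."""
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    fields = []
    for start, end in column_spans(lines):
        for row in padded:
            text = row[start:end].strip()
            if text:
                fields.append((text, label_field(text)))
    return fields


if __name__ == "__main__":
    block = [
        "Jane Doe".ljust(18) + "tel: 555 123 4567",
        "Example Labs".ljust(18) + "jane@example.org",
    ]
    for text, label in parse_signature(block):
        print(f"{label:6s} {text}")
```

Reading column by column rather than line by line is the point of the geometric step: a purely sequential reader would interleave the name with the phone number on the first line. A transducer-based system like the one the abstract describes would combine layout and linguistic scores jointly instead of running them as two fixed passes.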

Citation (APA)

Chen, H., Hu, J., & Sproat, R. W. (1999). Integrating geometrical and linguistic analysis for email signature block parsing. ACM Transactions on Information Systems, 17(4), 343–366. https://doi.org/10.1145/326440.326442
