Structured text documents let readers scan a page quickly and pick out the information relevant to them. By presenting information in this way, the author makes that information easy for the reader to extract. If the structure used throughout the document follows a pattern, or a set of patterns, that describes the text, then text pre-processing methods that can identify those patterns can extract the same text a human reader would pick out by eye. The extracted text can then be used in further text mining applications. This paper describes a text pre-processing program that identifies text patterns and extracts the appropriate text. © Springer-Verlag Berlin Heidelberg 2003.
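As a rough illustration of the idea, and not the program described in the paper, the following minimal Python sketch extracts fields from a structured newsfeed using a regular expression. The field labels (`HEADLINE:`, `DATE:`, `BODY:`), the sample text, and the names `ITEM_PATTERN` and `extract_items` are all hypothetical; the abstract does not specify the newsfeed layout or the patterns actually used.

```python
import re

# Hypothetical layout (an assumption, not taken from the paper): every item
# consists of a "HEADLINE:" line, a "DATE:" line, and a "BODY:" section that
# runs until the next "HEADLINE:" label or the end of the text.
ITEM_PATTERN = re.compile(
    r"HEADLINE:\s*(?P<headline>.+?)\s*\n"
    r"DATE:\s*(?P<date>\d{4}-\d{2}-\d{2})\s*\n"
    r"BODY:\s*(?P<body>.+?)\s*(?=\nHEADLINE:|\Z)",
    re.DOTALL,
)


def extract_items(text):
    """Return one dict of named fields per item matched by ITEM_PATTERN."""
    return [match.groupdict() for match in ITEM_PATTERN.finditer(text)]


# Invented sample input used only to exercise the pattern.
SAMPLE = (
    "HEADLINE: Markets close higher\n"
    "DATE: 2003-06-02\n"
    "BODY: Shares rose in late trading.\n"
    "HEADLINE: Storm warning issued\n"
    "DATE: 2003-06-03\n"
    "BODY: Heavy rain is expected\nalong the coast.\n"
)

items = extract_items(SAMPLE)
for item in items:
    print(item["date"], item["headline"])
```

Each resulting dictionary holds only the labelled fields, so the output could be passed to a later text mining step without the surrounding layout. A real feed would likely need several patterns and some handling for items that match none of them.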
CITATION STYLE
Bogg, P. (2003). Pattern based approaches to pre-processing structured text: A newsfeed example. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2660, 859–867. https://doi.org/10.1007/3-540-44864-0_88