On the Use of Regular Expressions for Searching Text

Abstract

The use of regular expressions for text search is widely known and well understood. It is therefore surprising that the standard techniques and tools prove to be of limited use for searching structured text formatted with SGML or similar markup languages. Our experience with structured text search has caused us to reexamine the current practice. The generally accepted rule of "leftmost longest match" is an unfortunate choice and is at the root of the difficulties. We instead propose a rule that is semantically cleaner. This rule is generally applicable to a variety of text search applications, including source code analysis, and has interesting properties in its own right. We have written a publicly available search tool implementing the theory described in this article, which has proved valuable in a variety of circumstances.
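To make the difficulty with longest matches on marked-up text concrete, the following minimal sketch may help. It is an illustration only and is not taken from the article: the sample string and the function names are hypothetical, and Python's re module uses backtracking with greedy quantifiers rather than a true POSIX leftmost-longest engine, although for this simple pattern the greedy result coincides with the longest match from the leftmost starting point. The non-greedy variant is shown purely for contrast and should not be read as the rule the authors propose, which the abstract does not spell out.

```python
import re

# Hypothetical SGML-like sample text (an assumption, not from the article).
TEXT = "<title>First</title><p>body</p><title>Second</title>"


def greedy_matches(text):
    """Match from the leftmost <title> and extend as far as possible."""
    return re.findall(r"<title>.*</title>", text)


def lazy_matches(text):
    """Match from each <title> and stop at the first closing tag."""
    return re.findall(r"<title>.*?</title>", text)


if __name__ == "__main__":
    # The greedy pattern swallows everything between the first opening
    # tag and the last closing tag, merging two elements into one match.
    print(greedy_matches(TEXT))
    # The non-greedy pattern returns each element separately.
    print(lazy_matches(TEXT))
```

With the sample above, the greedy pattern returns a single match covering the whole string, while the non-greedy pattern returns the two title elements individually, which is usually what a search over structured text is after.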

Clarke, C. L. A., & Cormack, G. V. (1997). On the Use of Regular Expressions for Searching Text. ACM Transactions on Programming Languages and Systems, 19(3), 413–426. https://doi.org/10.1145/256167.256174
