Skip to content

Regular Expressions

April 20, 2013

Since programmers and the software they write heavily use regular expressions to do their work, I found it very useful to learn how they actually work. Regular expressions operate by finding whether a given pattern exists within some corpus of text. In this respect, regular expressions are very much like substring searching. Interestingly, regular expressions also promise O(M*N) runtime in the worse case, just as substring search does for some pattern M. This is surprise because substring search looks for a particular string M, while regular expressions look for patterns that multiple strings can actually satisfy.

While substring searching depends on deterministic finite automata, regular expressions depend on nondeterministic finite automata. The relationship between DFA’s and Regular Expressions is defined in Kleene’s theorem which states: “DFAs and regular expressions give rise to exactly the same class of languages (the regular languages)”. The study of DFAs and NFAs provides great insight into why these problems can be solved without reading the same input more than once.

An interesting topic that I hadn’t realized until recently was that regular expressions (like hash functions) can be susceptible to an attack. Regular Expression DoS attack. If the implementation of RE forbids backtracking as a feature, then we have the O(M*N) guarantee on runtime. However, if that guarantee does not exist, then evil regular expressions can be exploited with the correct input. These attacks on RE implementations that allow backtracking can result in an exponential in N running time.

Finally, I’ve always used grep and it’s variants, but I never knew it was an acronym: Generalized Regular Expression Print.

Advertisements

From → Algorithms

Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: