77

What is the complexity, with respect to the string length, of matching a regular expression against a string?

Ahmad Farid
  • 3
    The complexity depends more on the nature of the regex itself than on the length of the string. – LukeH Dec 07 '10 at 15:40
  • @LukeH Alternatively, it depends on the programming language used. For example, Python Regex can never exceed the computational power of a DFA, but Perl Regex can be Turing complete. – BlackVegetable Apr 30 '13 at 20:40
  • possible duplicate of [Complexity of Regex substitution](http://stackoverflow.com/questions/21669/complexity-of-regex-substitution) – Kevin Jul 18 '14 at 18:32

4 Answers

75

The answer depends on what exactly you mean by "regular expressions." Classic regexes can be compiled into Deterministic Finite Automata that can match a string of length N in O(N) time. Certain extensions to the regex language change that for the worse.
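As a minimal sketch of why DFA matching is linear, the snippet below simulates a hand-built DFA for the textbook regex `(a|b)*abb`. The transition table and the name `dfa_match` are illustrative assumptions, not the output of any particular regex compiler. The point is only that each input character triggers exactly one table lookup, so a string of length N costs O(N) steps.

```python
# Hand-built DFA for the regex (a|b)*abb (illustrative, not compiler output).
# State k roughly means "the last characters seen match the first k of 'abb'".
TRANSITIONS = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}
START = 0
ACCEPTING = {3}


def dfa_match(text):
    """Return True if the whole of text is matched by (a|b)*abb."""
    state = START
    for ch in text:  # exactly one table lookup per character
        state = TRANSITIONS[state].get(ch)
        if state is None:  # character outside the alphabet {a, b}
            return False
    return state in ACCEPTING
```

For example, `dfa_match("babb")` walks the states 0, 0, 1, 2, 3 and accepts, while `dfa_match("abba")` ends in state 1 and rejects.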

You may find the following document of interest: Regular Expression Matching Can Be Simple And Fast.

NPE
  • I don't suppose it would be possible to get the test data used for that article? My workplace uses Perl regexes all the time. If they were really that slow, our hardware would fail completely. – DeepDeadpool May 20 '15 at 17:39
  • can you clarify exactly what you mean by "classic regexes"? – Varun Mathur Jan 25 '21 at 05:29
  • this is the execution time. What about compiling the regex? – ThisCompSciGuy Jan 27 '21 at 18:52
  • @VarunMathur A classic regex can be implemented entirely by concatenation, alternation, and the Kleene star (`*`). Operators like `+` are syntactic sugar, and do not increase the expressive power of a regular expression. Capture expressions are not part of a classic regular expression, as they allow you to match non-regular languages (a classic example being `(.*)x(\1)`, which does not even describe a context-free language). – chepner May 06 '21 at 18:40
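To make the backreference example in the last comment concrete, here is a small sketch using Python's standard `re` module, which accepts `\1`. The helper name `is_copy_around_x` is hypothetical. The pattern matches strings of the form w, then `x`, then the same w again, which a finite automaton cannot check for arbitrary w.

```python
import re

# (.*) captures some prefix w; \1 then requires the exact same text after 'x'.
COPY_PATTERN = re.compile(r"(.*)x\1")


def is_copy_around_x(text):
    """Return True if text has the form w + 'x' + w for some string w."""
    return COPY_PATTERN.fullmatch(text) is not None
```

Here `is_copy_around_x("abcxabc")` is true, while `is_copy_around_x("abcxabd")` is false because the text after `x` differs from the captured group.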
9

Unbounded: you can create a regular expression that never terminates, even on an empty input string.

Alex Brown
  • 2
    Just out of curiosity, could you give an example Alex? –  Dec 07 '10 at 15:50
  • 5
    see man perlre - "'foo' =~ m{ ( o? )* }x;". Perl has special code to detect infinite recursion in this case and break out. – Alex Brown Dec 07 '10 at 16:09
6

If you use regular expressions in the theoretical computer science sense (no backreferences; only concatenation, alternation, and Kleene star) and the regex is already compiled, then matching is O(n).

royas
0

If you're looking for tight asymptotic bounds on RegEx (without respect to the expression itself), then there aren't any. As Alex points out, you can create a regex that is O(1) or a regex that is Omega(infinity). As a purely mathematical algorithm, a regular expression engine would be far too complicated for any sort of formal asymptotic analysis (aside from the fact that such analysis would be basically worthless).

The growth rate of a particular expression (since that, really, constitutes an algorithm, anyway) would be far more meaningful, though not necessarily any easier to analyze.
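As a rough illustration of analysing one particular expression, the sketch below hard-codes a naive backtracking strategy for the pattern `(a|aa)*b` and counts how many times it is invoked on the input `"a" * n`, which can never match because it contains no `b`. The function `count_steps` and the matching strategy are assumptions made for this example, not how any specific engine works. The call count follows a Fibonacci-like recurrence, so it grows exponentially in n.

```python
def count_steps(n):
    """Count calls a naive backtracker makes for (a|aa)*b on 'a' * n.

    Returns (matched, steps). The input has no 'b', so matched is always
    False and every way of splitting the a's into 'a'/'aa' is explored.
    """
    text = "a" * n
    steps = 0

    def match_from(i):
        nonlocal steps
        steps += 1
        # Leave the loop: need a final 'b' that ends the text.
        if text[i:i + 1] == "b" and i + 1 == len(text):
            return True
        # One more loop iteration using the alternative 'a'.
        if text[i:i + 1] == "a" and match_from(i + 1):
            return True
        # One more loop iteration using the alternative 'aa'.
        if text[i:i + 2] == "aa" and match_from(i + 2):
            return True
        return False

    matched = match_from(0)
    return matched, steps
```

For n = 0 through 5 the step counts are 1, 2, 4, 7, 12 and 20. An automaton-based matcher for the same pattern would reject the same input after a single pass over it.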

Adam Robinson
  • 1
    That's considering extensions of formal regular expressions. Regular expressions involving usual constructs (no look-ahead/backwards patterns for example) can be proven to always terminate on any input, in a O(length of input string) time. – Clément Apr 22 '13 at 17:11
  • @clement Even most extensions do not push the RE beyond a DFA. For instance, Python Regex can always be modeled by a DFA. However, as soon as you start working with Perl regex (and I believe Javascript?) it becomes a different animal that is equivalent to a TM instead. – BlackVegetable Apr 30 '13 at 19:39
  • Um, no. Complexity of a real regular expression is well defined. – Charlie Martin Jun 17 '19 at 18:39