571

Possible Duplicate:
Is there a RegExp.escape function in Javascript?

I am trying to build a JavaScript regex based on user input:

function FindString(input) {
    var reg = new RegExp('' + input + '');
    // [snip] perform search
}

But the regex will not work correctly when the user input contains a ? or * because they are interpreted as regex specials. In fact, if the user puts an unbalanced ( or [ in their string, the regex isn't even valid.
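A minimal sketch of both failure modes. The inputs `'why?'` and `'('` are made up for illustration and are not from the original code:

```javascript
// Unescaped "?" makes the preceding "y" optional, so the pattern
// matches text that does not contain the literal string "why?".
var loose = new RegExp('why?');
var matchesUnintended = loose.test('wh'); // true

// An unbalanced "(" is not a valid pattern at all: the constructor throws.
var threw = false;
try {
  new RegExp('(');
} catch (e) {
  threw = e instanceof SyntaxError;
}

console.log(matchesUnintended, threw); // true true
```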

What is the JavaScript function to correctly escape all special characters for use in a regex?

Community
too much php

1 Answer

1156

Short 'n Sweet

function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}
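To see what the `'\\$&'` replacement does in isolation, here is a small sketch. The input `'a+b'` and the `<...>` wrapper are illustrative only:

```javascript
// "$&" in a replacement string stands for the text that was just matched.
var wrapped = 'a+b'.replace(/[+]/g, '<$&>'); // "a<+>b"

// '\\$&' in source code is a single backslash followed by "$&",
// so every match is re-emitted with a backslash in front of it.
var escaped = 'a+b'.replace(/[+]/g, '\\$&'); // a, backslash, +, b

console.log(wrapped, escaped, escaped.length); // a<+>b a\+b 4
```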

Example

escapeRegExp("All of these should be escaped: \\ ^ $ * + ? . ( ) | { } [ ]");

>>> "All of these should be escaped: \\ \^ \$ \* \+ \? \. \( \) \| \{ \} \[ \]"
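As a sketch of how this plugs back into the question's search function: `findString` below is a hypothetical variant of `FindString` that takes the text to search as a second argument, which the original snippet does not show.

```javascript
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: literal substring search built on RegExp.
function findString(input, haystack) {
  var reg = new RegExp(escapeRegExp(input));
  return reg.test(haystack);
}

console.log(findString('c++ (v2)?', 'Is this c++ (v2)? Yes.')); // true
console.log(findString('a.c', 'abc'));                           // false
console.log(findString('[', 'array[0]'));                        // true
```

Because the input is escaped first, `.` no longer acts as a wildcard and an unbalanced `[` no longer makes the pattern invalid.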

(NOTE: the above is not the original answer; it was edited to show the version from MDN. As a result, it does not match the code in the npm package below or the long answer below, and the comments are now confusing. My recommendation: use the above, or get it from MDN, and ignore the rest of this answer. - Darren, Nov 2019)

Install

Available on npm as escape-string-regexp

npm install --save escape-string-regexp

Note

See MDN: Javascript Guide: Regular Expressions

Other symbols (~`!@# ...) MAY be escaped without consequence in patterns that do not use the u flag, but are not required to be.
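One caveat worth checking: with the `u` (Unicode) flag, engines reject unnecessary escapes instead of tolerating them. A small sketch, using `~` as an arbitrary example character and assuming an engine that supports the `u` flag:

```javascript
// Without the "u" flag an unnecessary escape such as "\~" is tolerated.
var legacyOk = new RegExp('\\~').test('~'); // true

// With the "u" flag the same escape is rejected as invalid.
var unicodeThrew = false;
try {
  new RegExp('\\~', 'u');
} catch (e) {
  unicodeThrew = e instanceof SyntaxError;
}

console.log(legacyOk, unicodeThrew); // true true
```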

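Test Case: A typical URL (the output shown is from the long version below, which also escapes /; the short version above leaves / unescaped)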

escapeRegExp("/path/to/resource.html?search=query");

>>> "\/path\/to\/resource\.html\?search=query"

The Long Answer

If you're going to use the function above, at least link to this Stack Overflow post in your code's documentation so that it doesn't look like crazy hard-to-test voodoo.

var escapeRegExp;

(function () {
  // Referring to the table here:
  // https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/regexp
  // these characters should be escaped
  // \ ^ $ * + ? . ( ) | { } [ ]
  // These characters only have special meaning inside of brackets
  // they do not need to be escaped, but they MAY be escaped
  // without any adverse effects (to the best of my knowledge and casual testing)
  // : ! , = 
  // my test "~!@#$%^&*(){}[]`/=?+\|-_;:'\",<.>".match(/[\#]/g)

  var specials = [
        // order matters for these
          "-"
        , "["
        , "]"
        // order doesn't matter for any of these
        , "/"
        , "{"
        , "}"
        , "("
        , ")"
        , "*"
        , "+"
        , "?"
        , "."
        , "\\"
        , "^"
        , "$"
        , "|"
      ]

      // I choose to escape every character with '\'
      // even though only some strictly require it when inside of []
    , regex = RegExp('[' + specials.join('\\') + ']', 'g')
    ;

  escapeRegExp = function (str) {
    return str.replace(regex, "\\$&");
  };

  // test escapeRegExp("/path/to/res?search=this.that")
}());
Darren Cook
coolaj86
  • 25
    Wow, that's verbose. I prefer [bobince's version](http://stackoverflow.com/a/3561711/157247). But anything that works without escaping things unnecessarily... – T.J. Crowder Jun 15 '12 at 15:50
  • 2
    I expect all of the characters that SHOULD be escaped, not just the ones that MUST be escaped, which is what linters such as JSLint understand. – coolaj86 Oct 06 '12 at 17:37
  • If someone knows please: Why is `/` to be escaped? It's not in the list of characters, but it is in the regex both in this example and in the included MDN page's regex? – Eric Dec 05 '13 at 23:39
  • 1
    A literal regex is like `/blah/i`. A literal comment is `// blah`. So to prevent ambiguity `///` becomes `/\//` and `/blah/i/` becomes `/blah\/i/`. Make sense? – coolaj86 Dec 06 '13 at 23:48
  • 8
    Why is it replaced by '\\$&'? What is that supposed to mean? I am sorry, I am a JS newbie. – Sushant Gupta Jan 13 '14 at 15:38
  • 9
    @SushantGupta The "\\" adds the new backslash which escapes the matched special regex character. The "$&" is a back-reference to the contents of the current pattern match, adding the original special regex character. – danhbear Jan 14 '14 at 05:07
  • I think the ":" as e.g. used in "(?:x)" should also be escaped. The same applies for "=" as in "x(?=y)", the "!" as in "x(?!y)" and the "," as in "{n,m}". Or do you think, since the brackets are masked, these characters don't need to be masked??? – Philip Helger Jan 28 '15 at 10:48
  • ? and [ are escaped, which means that :, =, and ! have no special meaning. Check the comments in the code in the "Long Answer" block. – coolaj86 Jan 29 '15 at 07:49
  • you should also add the "m" for multiline – user467257 Jun 26 '15 at 08:15
  • 9
    Most of these characters don't need to be escaped within a character class. Dash and forward slash don't need to be escaped at all. So, this can be simplified as: return str.replace(/[[{}()*+?^$|\\]\.\\\]/g, "\\$&"); – richardtallent Sep 09 '15 at 20:03
  • I have a fairly complex regex (that needed to be broken down to 80 chars per line due to employer's coding conventions) and this version didn't work, but [bobince's version](http://stackoverflow.com/a/3561711/157247) did. – JKirchartz Oct 20 '15 at 13:23
  • @CoolAJ86 forward slashes shouldn't be escaped, they are not control characters in strings passed to `RegExp`. The only time you need to escape them is if you're generating JavaScript source code for use within a regex literal, which is a completely different problem and which your post doesn't even start to solve (for example, you don't escape newlines). – twhb Apr 17 '16 at 22:52
  • 11
    Is there a saner way in 2016? – rr- May 20 '16 at 22:36
  • The short one doesn't work for escaping "c++", which will be converted to c\\\\\\...(endless\) – dotslashlu Dec 19 '16 at 03:44
  • Nice answer. I looked for an online tool that does this but couldn't find one so have knocked up an online version at jsfiddle: https://jsfiddle.net/bwp2m5Lp/ – Steve Chambers Feb 20 '17 at 13:57
  • Due to the recent ESLint release (`no-useless-escape`) it fails with this RegExp now by default. – dude Sep 15 '17 at 08:06
  • 1
    Yes most of the escaped chars are redundant as eslint keeps telling me. I've tested and this would seem equivalent: `str.replace(/[-[\]/{}()*+?.\\^$|]/g, '\\$&');` – Richard Williams Dec 01 '17 at 12:16
  • 1
    Why is this hilariously wrong answer upvoted? It fails for even this simple regex: [\s\S]* – Kal_Torak Feb 21 '18 at 23:50
  • 2
    @Kal_Torak: var re = /[\s\S]*/; escapeRegExp(re.toString()); – coolaj86 Apr 04 '18 at 07:36
  • @Kal_Torak it does not fail, this is because you need to escape the escape in your string as shown by [@CoolAJ86](https://stackoverflow.com/users/151312/coolaj86) – Timo Huovinen Aug 07 '18 at 08:52
  • 1
    I just added a note to this answer to hopefully be less confusing; linting and checking is more strict in the latest browsers, and escaping `-` (as shown in the long answer, and the pre-2018 short answer) was causing runtime errors. Using the regex from MDN fixed it for me. – Darren Cook Nov 28 '19 at 11:27
  • @DarrenCook Assuming that [ is escaped I suppose it should be safe to not escape -. However, if you concatenate a string with - inside of a RegExp with [ before ], then you will in fact need it to be escaped. – coolaj86 Nov 29 '19 at 02:44
  • @CoolAJ86 Escaping something inside a character class (square brackets) makes no sense. If you are deliberately inserting end user text into just the character class part of a regex, you need to do something different: detect if any hyphens, and if so remove them all, then put the hyphen at the very start or end of the character class; an inserted close square bracket does need escaping, though. – Darren Cook Nov 29 '19 at 07:59
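The last exchange concerns user text placed inside a character class rather than used as a whole pattern. As a sketch of one possible approach, the hypothetical helper `escapeForCharClass` below backslash-escapes the four characters that are special inside `[...]` (`\`, `]`, `^`, `-`). Note that this is a different technique from the hyphen-reordering suggested in the final comment, and it is an assumption of this example, not code from the answer:

```javascript
// Hypothetical helper: escape text for use INSIDE [...] only.
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]^-]/g, '\\$&');
}

// Illustrative input: the characters a, -, c, ], ^ and backslash.
var userChars = 'a-c]^\\';
var reg = new RegExp('[' + escapeForCharClass(userChars) + ']', 'g');

// Every listed character is removed; "b" survives because "a-c"
// is no longer treated as a range.
console.log('abc-]^\\'.replace(reg, '')); // "b"
```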