Use this function to make sure that all special characters are escaped and treated as literal characters in the regex:
function escapeRegex(input) {
    // Prefix every regex meta-character with a backslash; $& is the matched character.
    return input.replace(/[[\](){}?*+^$\\.|]/g, '\\$&');
}
The function expects a String as input and outputs a String with all the special characters escaped. The result is meant to be fed to the RegExp constructor to create a regex that matches the original string. Regarding whether the output of this method can be concatenated safely, check my additional notes below.
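As a quick illustration of the intended use, here is a minimal sketch. The sample string and the `^`/`$` anchors are my own choices for the demo, not part of the original function:

```javascript
// Same function as above, repeated so the snippet runs on its own.
function escapeRegex(input) {
    return input.replace(/[[\](){}?*+^$\\.|]/g, '\\$&');
}

const raw = "1+1=2 (really?)";
const escaped = escapeRegex(raw);
console.log(escaped); // 1\+1=2 \(really\?\)

// Anchored regex built from the escaped string: matches only the original text.
const re = new RegExp("^" + escaped + "$");
console.log(re.test(raw));           // true
console.log(re.test("11=2 really")); // false

// Without escaping, "+", "(", ")" and "?" keep their special meaning.
const unescaped = new RegExp("^" + raw + "$");
console.log(unescaped.test(raw));           // false
console.log(unescaped.test("11=2 really")); // true
```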
See the list of all special characters in JS regex on MDN.
Nothing much to say about `^`, `$`, `.`, `|`, `*`, `?`, `+`.
This also effectively disables the special meaning of `^` as the first character inside `[]`, and of `?` as the first character inside `()`.
The same goes for `?` and its lazy-matching behavior when it follows a quantifier.
`-` is only meaningful inside `[]`, but no longer once `[` and `]` are escaped.
There might be a problem if the template string is `"[" + input + "]"`. I don't emulate the behavior of `\Q` and `\E` inside a character class here, but you can add `-` to the regex in the function above if you want to.
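To make the character-class case concrete, here is a small sketch. `escapeRegexForClass` is a hypothetical name for the variant with `-` added; it is not part of the original function:

```javascript
// Original function, repeated so the snippet runs on its own.
function escapeRegex(input) {
    return input.replace(/[[\](){}?*+^$\\.|]/g, '\\$&');
}

// Hypothetical variant: same set of characters plus "-" (placed last so it is literal).
function escapeRegexForClass(input) {
    return input.replace(/[[\](){}?*+^$\\.|-]/g, '\\$&');
}

const chars = "a-c";

// "-" is left alone, so the class is read as the range a..c.
const unsafe = new RegExp("^[" + escapeRegex(chars) + "]$");
console.log(unsafe.test("b")); // true

// "-" is escaped, so the class contains exactly "a", "-" and "c".
const safe = new RegExp("^[" + escapeRegexForClass(chars) + "]$");
console.log(safe.test("b")); // false
console.log(safe.test("-")); // true
```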
`\` followed by some special sequence loses its meaning when the `\` is escaped.
On a related note, the case where my method above fails is when the template string is `"\\" + input`. However, I would say the fault lies with whoever wrote the template string, since this is total nonsense.
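A minimal sketch of that failure, using an input of my own choosing (`"d"`) and anchors added only to keep the demo unambiguous:

```javascript
function escapeRegex(input) {
    return input.replace(/[[\](){}?*+^$\\.|]/g, '\\$&');
}

// The template ends with a lone backslash, which pairs up with the first
// character of the (correctly escaped) input and forms the escape \d.
const broken = new RegExp("^\\" + escapeRegex("d") + "$");
console.log(broken.source);    // ^\d$
console.log(broken.test("7")); // true: \d matches a digit
console.log(broken.test("d")); // false: the literal "d" is no longer matched
```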
`:`, `=`, `!` are only meaningful inside `()` (for non-capturing groups and look-aheads) and must directly follow `?`; they also lose their meaning when `(` and `)` are escaped. The `?` is already escaped, so it poses no problem when the escaped string is inserted between `()`.
Without escaping those, the method above fails when the template string is `"(?" + input + ")"`. I again blame whoever wrote this, since they are the one allowing the injection.
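For illustration, a short sketch of that injection. The leading `x` and the input `"!y"` are assumptions made up for the demo:

```javascript
function escapeRegex(input) {
    return input.replace(/[[\](){}?*+^$\\.|]/g, '\\$&');
}

// "!" is not escaped, so the template's "(?" plus the input's "!" form a
// negative look-ahead instead of matching the input literally.
const broken = new RegExp("x(?" + escapeRegex("!y") + ")");
console.log(broken.source);     // x(?!y)
console.log(broken.test("xz")); // true: "x" not followed by "y"
console.log(broken.test("xy")); // false
```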
`,` is only meaningful inside `{}`, but loses its meaning when `{` and `}` are escaped.
The case where the escaping fails is when you have a template string (e.g. to match an initializer) `"\\w+ = {" + input + "}"`, but normally one will escape `{` and `}` in the template string if the intention is to match them as literal characters.
There is also the case of repetition, but then the template string should be `".{" + start + "," + end + "}"`, and the input must be sanitized first.
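Both brace cases can be sketched as follows. The anchors, the sample strings, and the helper `anyCharRepeated` are hypothetical and only illustrate one possible way to sanitize the bounds:

```javascript
function escapeRegex(input) {
    return input.replace(/[[\](){}?*+^$\\.|]/g, '\\$&');
}

// Unescaped braces in the template: "{1,2}" becomes a quantifier on the preceding space.
const broken = new RegExp("^\\w+ = {" + escapeRegex("1,2") + "}$");
console.log(broken.test("x = {1,2}")); // false

// Braces escaped in the template: they are matched literally.
const fixed = new RegExp("^\\w+ = \\{" + escapeRegex("1,2") + "\\}$");
console.log(fixed.test("x = {1,2}")); // true

// Repetition: the bounds are not text to escape, so validate them as numbers instead.
function anyCharRepeated(start, end) {
    if (!Number.isSafeInteger(start) || !Number.isSafeInteger(end) ||
        start < 0 || end < start) {
        throw new RangeError("start and end must be integers with 0 <= start <= end");
    }
    return new RegExp("^.{" + start + "," + end + "}$");
}

console.log(anyCharRepeated(2, 3).test("abc"));  // true
console.log(anyCharRepeated(2, 3).test("abcd")); // false
```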
In summary, the meta-characters in the template string must be properly escaped for any escaping function to work. If the escaped string is to be used in a character class, add `-` to the character class in the escaping function's regex.