
I want to write a regular expression that captures all double quotes " in a string, except for those that are escaped.

For example, the following string should yield only one match, the first quote:

"HELLO\"\"\"

but the following one will return 3 matches:

"HELLO\"\""\""

I have tried the following expression, but since JavaScript has no negative lookbehind, I am stuck:

(?<!\\)"

I have looked at similar questions, but most of them provide a programmatic solution. I don't want a programmatic solution because I am using the Ace editor, and the simplest way to get around my problem is to define this regex.

I suppose there is no generic alternative, since I have tried the alternatives proposed in the similar questions, but none of them exactly matched my case.

Thanks for your answers!

user1485864
  • Generally speaking, there is no equivalent for all cases - some can and some can't be emulated depending on the requirement. – nhahtdh Oct 24 '14 at 17:06
  • Yes, I also had this feeling, and I guess my requirement is a case that cannot be emulated. I am sure there is a proof somewhere explaining why lookbehind cannot always be emulated. – user1485864 Oct 25 '14 at 08:09
  • Possible duplicate of [Javascript: negative lookbehind equivalent?](http://stackoverflow.com/questions/641407/javascript-negative-lookbehind-equivalent) – Adam Katz Feb 09 '16 at 01:52

2 Answers


You can use this workaround:

(^|[^\\])"

This matches a " only if it is preceded by any character other than a \, or if it sits at the beginning of the string (^).

But be careful: this matches two characters, the " AND the preceding character (except in the start-of-string case). In other words, if you want to replace all these " with ' for example, you'll need:

theString.replace(/(^|[^\\])"/g, "$1'")
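A minimal sketch of how this replacement behaves, assuming a hypothetical helper name `replaceUnescapedQuotes` that simply wraps the call above. Because each match consumes the character before the quote, two unescaped quotes directly next to each other cannot both match in a single pass:

```javascript
// Hypothetical helper (name is an assumption): wraps the replace call shown above.
function replaceUnescapedQuotes(theString) {
  // $1 puts back the character (or empty start-of-string) captured before the quote.
  return theString.replace(/(^|[^\\])"/g, "$1'");
}

// The string "HELLO\"\"\" : only the leading quote is unescaped.
console.log(replaceUnescapedQuotes('"HELLO\\"\\"\\"')); // 'HELLO\"\"\"

// Two adjacent unescaped quotes: the first quote is consumed as part of the
// first match, so it cannot serve as the "preceding character" for the second.
console.log(replaceUnescapedQuotes('a""b')); // a'"b
```

If adjacent unescaped quotes can occur in the input, a single pass with this pattern is probably not enough.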
sp00m
  • Yes, this is also the first regex I came up with. But as you said, this will not work because it will capture `""` in the string `\""`, whereas I only need to capture the latter `"`. – user1485864 Oct 25 '14 at 08:07

The code I assume you are trying to run:

var re = /(?<!\\)"/g, matcher;  // keep one regex object so lastIndex advances
while ( (matcher = re.exec(theString)) !== null ) {
    // do stuff. matcher[0] is a double quote (that doesn't follow a backslash)
}

In JavaScript, using this guide to JS lookbehinds:

var re = /(\\)?"/g, matcher;  // keep one regex object so lastIndex advances
while ( (matcher = re.exec(theString)) !== null ) {
  if (!matcher[1]) {
    // do stuff. matcher[0] is a double quote (that doesn't follow a backslash)
  }
}

This matches each double quote (") together with an optional preceding backslash (\), and then skips the match whenever that backslash was actually present.

If you were merely trying to count the number of unescaped double-quotes, the "do stuff" line could be count++.
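As a rough sketch, the counting variant could look like the following. The function name `countUnescapedQuotes` is hypothetical, and the code assumes that a backslash always escapes the quote right after it (so an escaped backslash followed by a quote, `\\"`, is also treated as escaped):

```javascript
// Hypothetical helper (name is an assumption): counts double quotes that are
// not directly preceded by a backslash, using the optional-group approach above.
function countUnescapedQuotes(theString) {
  const re = /(\\)?"/g; // created once so lastIndex advances across exec() calls
  let count = 0;
  let matcher;
  while ((matcher = re.exec(theString)) !== null) {
    if (!matcher[1]) {
      count++; // group 1 is undefined, so no backslash preceded this quote
    }
  }
  return count;
}

// The two strings from the question, written as JavaScript literals.
console.log(countUnescapedQuotes('"HELLO\\"\\"\\"'));   // 1
console.log(countUnescapedQuotes('"HELLO\\"\\""\\""')); // 3
```

Unlike the two-character workaround, each match here consumes only the quote and its own backslash, so adjacent unescaped quotes are each counted.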

Adam Katz