
How do we lookahead until there is no back reference of a character in RegEx?

Given:

We are looking for phrases within quotes and it can be multiline "check we have a return here but this line is still part of previous one 'a' string".

It breaks once we have another 'testing with single quotes "surrounding" double quotes';

How do we match a double-quoted or single-quoted phrase up to the quote that closes it, even when the other kind of quote appears inside?

I tried this pattern, but it does not work:

/(['"])[^$1]+\1/g

Look here

PRAISER

1 Answer


If your strings have no escape sequences, it is as easy as using a tempered greedy token like

/(['"])(?:(?!\1)[\s\S])+\1/g

See the regex demo. The (?:(?!\1)[\s\S])+ matches one or more characters ([\s\S]), each of which is checked by the (?!\1) lookahead so that it is not the value captured into Group 1 (either ' or "). To also match "" or '', replace the + quantifier (1 or more occurrences) with the * quantifier (0 or more occurrences).
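As a quick check of the behavior described above, here is a minimal sketch that runs the pattern against text adapted from the question. The variable names (sampleText, quotedPhrase, and so on) are my own and only illustrative.

```javascript
// Sample text adapted from the question: the first quoted phrase spans two lines.
const sampleText = [
  'We are looking for "check we have a return here',
  "but this line is still part of the previous one 'a' string\".",
  "It breaks once we have another 'testing with single quotes \"surrounding\" double quotes';",
].join('\n');

// Group 1 captures the opening quote; (?!\1) stops the loop at the same quote.
const quotedPhrase = /(['"])(?:(?!\1)[\s\S])+\1/g;
const phrases = sampleText.match(quotedPhrase);
console.log(phrases.length); // 2
console.log(phrases[1]);     // 'testing with single quotes "surrounding" double quotes'

// With + an empty pair is skipped; with * it is matched.
const emptyPairText = 'an empty pair: ""';
const withPlus = emptyPairText.match(/(['"])(?:(?!\1)[\s\S])+\1/g); // null
const withStar = emptyPairText.match(/(['"])(?:(?!\1)[\s\S])*\1/g); // [ '""' ]
console.log(withPlus, withStar);
```

The first match runs across the line break and keeps the inner 'a' because only the opening quote character ends the loop.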

If the strings may contain escape sequences, use

/(['"])(?:\\[\s\S]|(?!\1)[^\\])*?\1/g

See this demo.

See the pattern details:

  • (['"]) - Group 1 capturing a ' or "
  • (?:\\[\s\S]|(?!\1)[^\\])*? - 0+ (but as few as possible) occurrences of
    • \\[\s\S] - any escape sequence (a backslash followed by any char)
    • | - or
    • (?!\1)[^\\] - any char other than \ and the one captured into Group 1
  • \1 - the value kept in Group 1.
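To illustrate the breakdown above, here is a small sketch comparing the two patterns on a string that contains backslash-escaped quotes. The sample string and the variable names are assumptions made up for this example.

```javascript
// String.raw keeps the backslashes literally, so the text contains \" and \'.
const escapedSample = String.raw`say "a \"quoted\" word" and 'it\'s fine'`;

// Escape-aware pattern: a backslash always consumes the char that follows it.
const quotedWithEscapes = /(['"])(?:\\[\s\S]|(?!\1)[^\\])*?\1/g;
const escapedMatches = escapedSample.match(quotedWithEscapes);
console.log(escapedMatches.length); // 2
console.log(escapedMatches[0]);     // "a \"quoted\" word"
console.log(escapedMatches[1]);     // 'it\'s fine'

// The simpler tempered greedy token stops at the first escaped quote instead.
const naiveMatches = escapedSample.match(/(['"])(?:(?!\1)[\s\S])+\1/g);
console.log(naiveMatches[0]);       // "a \"
```

Because [^\\] can never consume a backslash, the only way past one is the \\[\s\S] branch, which also swallows the escaped quote.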

NOTE: [\s\S] in JS matches any char, including line break chars. A JS-only construct that matches any char is [^]; it is preferable from the performance point of view, but it is not advised because other regex flavors do not support it (that is, it is not portable).
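A short sketch of the difference the note describes, using a two-line string of my own. The last line uses the s (dotAll) flag, which is not mentioned in the answer; it was added in ES2018, so treat it as an extra that assumes a reasonably recent engine.

```javascript
const twoLines = 'a\nb';

// A plain dot does not match a line break.
const dotMatches = /a.b/.test(twoLines);            // false
// [\s\S] matches any char, line breaks included (portable across flavors).
const classMatches = /a[\s\S]b/.test(twoLines);     // true
// [^] is the JS-only "any char" construct.
const emptyNegatedMatches = /a[^]b/.test(twoLines); // true
// With the s flag the dot matches line breaks as well (ES2018 and later).
const dotAllMatches = /a.b/s.test(twoLines);        // true

console.log(dotMatches, classMatches, emptyNegatedMatches, dotAllMatches);
```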

Wiktor Stribiżew
  • The `[^\\]` is a negated character class that matches any char but `\`. It is needed so that the second alternative branch cannot match a backslash, because the first branch, `\\[\s\S]`, already handles it. A standalone `\` is not expected in a C string literal. – Wiktor Stribiżew Aug 22 '17 at 11:42