1

I'm looking for a regex that matches everything except three consecutive double quotes. The problem is that when I use a plain negative lookahead, the consecutive double quotes still get gobbled up, so it doesn't match what I want.

Let's assume I have the following text:

Lorem Ipsum
"""
sdsdfgsdf
"""
bar

I want a regex, applied line by line, that matches the first, third and fifth lines, but not the """ lines.

I've tried the following regex: /(?!""").*/, but that's when the double quotes get gobbled. Trying to match one double quote at a time using ["] fails too: /(?!["]["]["]).*/

I'm using Python to match the regex.

Any ideas how I can make this regex work?

tuxtimo

1 Answer

3

The unanchored pattern (?!""").* matches any character zero or more times, starting at any position that is not immediately followed by """. Because it is not anchored, the regex engine can start the match after the first " in """: at that position only two quotes remain, so the assertion succeeds.
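To see this in Python, here is a minimal sketch using re.search on a line that contains only the three quotes. The variable names are just for illustration.

```python
import re

# The unanchored pattern from the question.
unanchored = re.compile(r'(?!""").*')

# re.search tries every start position in turn. At index 0 the lookahead
# fails (three quotes follow), but at index 1 only two quotes remain,
# so the lookahead succeeds and .* consumes the rest of the line.
m = unanchored.search('"""')
start, matched = m.start(), m.group()
print(start, repr(matched))  # 1 '""'
```

So the pattern does not reject the line, it just starts matching one character later.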

You have to use the anchor ^ to assert the start of the string. If the three double quotes must not occur anywhere in the string, also add .* inside the negative lookahead:

^(?!.*""").*$

Or, if you only want to reject strings that consist of exactly the three consecutive quotes, put just those quotes followed by $ in the lookahead:

^(?!"""$).*$
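A rough sketch of how both anchored patterns could be applied line by line in Python, using the sample text from the question. The variable names and the extra test string are assumptions made for illustration, not part of the original question.

```python
import re

# Sample text from the question.
text = 'Lorem Ipsum\n"""\nsdsdfgsdf\n"""\nbar'

# Rejects any line that contains """ anywhere.
no_quotes_anywhere = re.compile(r'^(?!.*""").*$')
# Rejects only lines that are exactly """.
not_only_quotes = re.compile(r'^(?!"""$).*$')

lines = text.splitlines()
kept_a = [ln for ln in lines if no_quotes_anywhere.match(ln)]
kept_b = [ln for ln in lines if not_only_quotes.match(ln)]
print(kept_a)  # ['Lorem Ipsum', 'sdsdfgsdf', 'bar']
print(kept_b)  # ['Lorem Ipsum', 'sdsdfgsdf', 'bar']

# The two patterns differ on a line that contains """ plus other text
# (a hypothetical example line).
extra = 'say """hi"""'
print(no_quotes_anywhere.match(extra) is None)   # True: line rejected
print(not_only_quotes.match(extra) is not None)  # True: line accepted
```

Both patterns give the same result on the sample text. Which one fits depends on whether a line that mixes """ with other characters should be kept.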

The fourth bird