I'm trying to modify a regex to capture tags inside an HTML anchor element (I know you're not supposed to parse HTML with a regex, but it is required). The problem arose when I tried to skip over anything in between the capture groups: the "anything" also consumes the tags inside the element, so they aren't captured. So far I have tried a non-capturing group and a negated set, but both seem to "swallow" my groups.

/<a[^>]*href=\"([^\"]+)\"(?:.*?)( data-survey=[\"\']({[^}]*})[\"\'])?( data-answer=[\"\']({[^}]*})[\"\'])?[^>]*\/?>/g

The (?:.*?) seems to be the culprit here. For example, the groups stay empty for: <a href="#" foo data-survey="{...}">. As long as there isn't anything in between the <a and the data... it seems to work.

THess
brknd

1 Answer

Try replacing that problematic:

(?:.*?)

with:

(?:(?! data-(?:survey=|answer=)).)*

which says to keep matching the next character as long as the next characters are not ' data-survey=' or ' data-answer='.

See Regex Demo
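
To make the difference concrete, here is a minimal sketch in JavaScript (assumed from the `/.../g` literal in the question) that runs both patterns against the sample tag from the question. The variable names are made up for illustration, and the braces are escaped as `\{` and `\}` only to be safe. With the lazy `(?:.*?)`, the engine is happy to match nothing, skip both optional groups, and let the trailing `[^>]*` eat the rest of the tag, so the capture stays empty. The lookahead version has to walk up to ` data-survey=` before it stops, which puts the optional group in the right place to match.

```javascript
// Pattern from the question: the lazy (?:.*?) matches zero characters,
// the optional groups are skipped, and [^>]* consumes the attributes.
const lazyPattern = /<a[^>]*href="([^"]+)"(?:.*?)( data-survey=["'](\{[^}]*\})["'])?( data-answer=["'](\{[^}]*\})["'])?[^>]*\/?>/;

// Pattern from the answer: consume characters one at a time, but only
// while the upcoming text is not " data-survey=" or " data-answer=".
const temperedPattern = /<a[^>]*href="([^"]+)"(?:(?! data-(?:survey=|answer=)).)*( data-survey=["'](\{[^}]*\})["'])?( data-answer=["'](\{[^}]*\})["'])?[^>]*\/?>/;

// Sample tag taken from the question.
const sample = '<a href="#" foo data-survey="{...}">';

const lazyMatch = sample.match(lazyPattern);
const temperedMatch = sample.match(temperedPattern);

console.log(lazyMatch[3]);     // undefined
console.log(temperedMatch[3]); // {...}
```

Both patterns match the whole tag; only the second one fills capture group 3.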

Booboo
  • Had to change it to `(?:(?! data-|>).)*` so it doesn't match too much when it encounters only an ``````, but this worked. Thank you! – brknd Dec 02 '19 at 11:49
  • Changed it to make it a bit more robust. – Booboo Dec 02 '19 at 11:50
  • I think what you had originally would have worked if `data-survey="etc.` were not optional. – Booboo Dec 02 '19 at 11:53
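
The over-matching mentioned in the first comment can be sketched as follows. The input string is a hypothetical example, not one from the thread: an anchor without data attributes followed by another tag that has one. Because `.` in the answer's pattern may cross `>`, the match runs on into the next tag; adding `>` to the lookahead, as in the comment, stops it at the end of the anchor's opening tag.

```javascript
// Pattern from the answer: "." is allowed to cross ">" into later tags.
const answerPattern = /<a[^>]*href="([^"]+)"(?:(?! data-(?:survey=|answer=)).)*( data-survey=["'](\{[^}]*\})["'])?( data-answer=["'](\{[^}]*\})["'])?[^>]*\/?>/;

// Variant from the comment: also stop before ">" (and before any " data-").
const guardedPattern = /<a[^>]*href="([^"]+)"(?:(?! data-|>).)*( data-survey=["'](\{[^}]*\})["'])?( data-answer=["'](\{[^}]*\})["'])?[^>]*\/?>/;

// Hypothetical input: the anchor itself has no data attributes.
const html = '<a href="#">link</a> <span data-survey="{x}">';

const crossed = html.match(answerPattern);
const guarded = html.match(guardedPattern);

console.log(crossed[0]); // <a href="#">link</a> <span data-survey="{x}">
console.log(crossed[3]); // {x}
console.log(guarded[0]); // <a href="#">
console.log(guarded[3]); // undefined
```

One trade-off of the guarded variant, as far as I can tell: since it stops before any ` data-`, an unrelated `data-*` attribute placed ahead of `data-survey` would leave the survey group empty again.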