George, resurrecting this ancient question because it had a simple solution that wasn't mentioned. This situation is straight out of my pet question of the moment, "Match (or replace) a pattern except in situations s1, s2, s3, etc."
You want to modify the following regex to exclude anything between <script> and </script>:
(\bSOMETERM|SOMETERM\b)(?!([^<]+)?>)
Please forgive me for switching out $term with SOMETERM; it is for clarity, because $ has a special meaning in regex.
With all the disclaimers about matching HTML in regex, to exclude anything between <script> and </script>, you can simply add this to the beginning of your regex:
<script>.*?</script>(*SKIP)(*F)|
so the regex becomes:
<script>.*?</script>(*SKIP)(*F)|(\bSOMETERM|SOMETERM\b)(?!([^<]+)?>)
How does this work?
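The left side of the OR (i.e., |) matches complete <script>...</script> blocks, then deliberately fails: (*F) forces the failure, and (*SKIP) prevents the engine from retrying at positions inside the block it just matched. The right side matches what you were matching before, and we know it is the right stuff because if it were between script tags, the left side would already have consumed and discarded it.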
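If your engine lacks the (*SKIP)(*F) verbs (Python's standard `re` module is one such engine), a variant of the same idea may help: let the left side match the script block without capturing it, capture only the right side in group 1, and act only when group 1 is set. The sketch below is a rough illustration, not the original PHP code; the function name `highlight`, the `<b>` wrapper, and the DOTALL flag are assumptions added for the example.

```python
import re

# Left branch: a whole <script>...</script> block, not captured.
# Right branch: the original pattern, captured in group 1.
# re.DOTALL is an assumption so that "." also spans multi-line scripts.
PATTERN = re.compile(
    r"<script>.*?</script>|(\bSOMETERM|SOMETERM\b)(?!([^<]+)?>)",
    re.DOTALL,
)

def highlight(html: str) -> str:
    """Wrap SOMETERM in <b> tags (hypothetical), leaving script blocks alone."""
    def repl(m: re.Match) -> str:
        if m.group(1) is None:
            # The left branch matched a script block: put it back unchanged.
            return m.group(0)
        return "<b>" + m.group(1) + "</b>"
    return PATTERN.sub(repl, html)

sample = (
    "<p>SOMETERM here</p>"
    "<script>var SOMETERM = 1;</script>"
    '<a title="SOMETERM">x</a>'
)
print(highlight(sample))
```

Only the first SOMETERM should be wrapped: the second sits inside a script block, and the third is rejected by the original lookahead because it is inside a tag.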
Reference
How to match (or replace) a pattern except in situations s1, s2, s3...