This is actually not that much of a problem with regex (assuming you can guarantee that <tr>
will not show up in comments, strings etc.; otherwise the regex will mis-match):
<tr\b(?:(?!</?tr\b).)*</tr>
will only match innermost tr tags. Use the dot-matches-newlines option of your regex engine, or rows that span several lines will not match. If you don't have one (JavaScript before the ES2018 s flag, I'm talking to you!), then use [\s\S] instead of the dot (.).
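As a rough illustration of the two variants, here is a minimal Python sketch. The names `PATTERN_DOTALL`, `PATTERN_CLASS`, `innermost_rows` and the `sample` markup are made up for this example, and it assumes, as stated above, that `<tr>` never appears inside comments or strings.

```python
import re

# Variant 1: plain dot plus the dot-matches-newlines option (re.DOTALL).
PATTERN_DOTALL = re.compile(r"<tr\b(?:(?!</?tr\b).)*</tr>", re.DOTALL)

# Variant 2: [\s\S] instead of the dot, for engines without such an option.
PATTERN_CLASS = re.compile(r"<tr\b(?:(?!</?tr\b)[\s\S])*</tr>")

# Hypothetical input: an outer row wrapping a nested table, then a flat row.
sample = (
    "<table>\n"
    "<tr><td>\n"
    "<table>\n"
    "<tr>\n<td>inner</td>\n</tr>\n"
    "</table>\n"
    "</td></tr>\n"
    "<tr><td>flat</td></tr>\n"
    "</table>"
)


def innermost_rows(html, pattern=PATTERN_DOTALL):
    """Return every <tr>...</tr> block that contains no nested <tr>."""
    return pattern.findall(html)


for row in innermost_rows(sample):
    print(repr(row))
# '<tr>\n<td>inner</td>\n</tr>'
# '<tr><td>flat</td></tr>'
```

The outer row is skipped because it contains another `<tr`. Both variants return the same two rows here; without `re.DOTALL` the plain-dot version would miss the multi-line inner row.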
Explanation:
<tr\b # Match a tag that starts with tr
(?: # Match...
(?! # (unless it's possible to match
</?tr\b # <tr or </tr at the current position)
)
. # any character
)* # any number of times.
</tr> # Match </tr>
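If your engine has a free-spacing (verbose) mode, the commented form can be compiled more or less as written. The sketch below assumes Python's `re.VERBOSE`; the name `VERBOSE_TR` and the test string are only for illustration.

```python
import re

# re.VERBOSE ignores whitespace and "#" comments inside the pattern;
# re.DOTALL makes "." match newlines as well.
VERBOSE_TR = re.compile(
    r"""
    <tr\b          # Match a tag that starts with tr
    (?:            # Match...
      (?!          # (unless it's possible to match
        </?tr\b    # <tr or </tr at the current position)
      )
      .            # any character
    )*             # any number of times.
    </tr>          # Match </tr>
    """,
    re.VERBOSE | re.DOTALL,
)

nested = "<tr><td><table><tr><td>x</td></tr></table></td></tr>"
print(VERBOSE_TR.findall(nested))
# ['<tr><td>x</td></tr>']
```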