0

I need help with a regex. I would like to find (to replace it) the value in tag "a" but only if it is between the "START" and "END" comments. So in this case it should match 3.3.0:

    <properties>
        <!-- START -->
        <a>3.3.0</a>
        <b>34</b>
        <!-- END  -->
        <c>11</c>
    </properties>

In this case it should NOT match:

<properties>
    <!-- START -->
    <b>34</b>
    <!-- END  -->
    <a>3.3.0</a>
    <c>11</c>
</properties>

Just an note: Finally a want to replace the value of a in the first case using python re.

Thanks, Alex

Alex
  • 19
  • 2

1 Answers1

0

As a general comment here, manipulating XML (or JSON) in this way using regex is not ideal. Instead, you might want to invest some time into learning how to parse XML. That being said, we could use regex here:

inp = """<properties>
<!-- START -->
<a>3.3.0</a>
<b>34</b>
<!-- END -->
<c>11</c>
</properties>"""
output = re.sub(r'(<!-- START -->(?:(?!<!-- END -->).)*)<a>.*?</a>(.*?<!-- END -->)', r'\1<a>something else</a>\2', inp, flags=re.DOTALL)
print(output)

This prints:

<properties>
<!-- START -->
<a>something else</a>
<b>34</b>
<!-- END -->
<c>11</c>
</properties>

The regex pattern uses a tempered dot to make sure that we can find the <a> tag without passing over the <!!-- END --> marker. So, it matches your first example, but not the second one.

Tim Biegeleisen
  • 387,723
  • 20
  • 200
  • 263
  • Thanks! I agree that xml parsing is a better way and I will do that but anyway I wanted to know the regex – Alex Mar 26 '21 at 12:22