I have some text, which looks like this:
12 12 obj
<<
Some content here
>>
endobj
12 13 obj
<<
Some content here with a email address that contains @mail.
>>
endobj
11 12 obj
<<
Some more content here
>>
endobj
I want to remove any text block, from its \d+ \d+ obj header (e.g. 12 13 obj) through its closing endobj, if it contains a specific string, which in this case would be @mail. I'm having some trouble finding the right regex for this, though.
I'm able to successfully select each block with (\d+\ \d+\ obj[\s\S]+?endobj)
See test here: https://regex101.com/r/V4WAMl/5
But I can't get this one, which adds the @mail condition, to work as I want: (\d+\ \d+\ obj[\s\S]+?@mail[\s\S]+?endobj)
See test here: https://regex101.com/r/V4WAMl/4
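To make the problem reproducible outside regex101, here is a minimal Python sketch of the second pattern applied to the sample text. The variable names (`text`, `pattern`, `match`) are only illustrative, and the sample is assumed to end with a trailing newline.

```python
import re

# Sample text from the question (trailing newline assumed).
text = """12 12 obj
<<
Some content here
>>
endobj
12 13 obj
<<
Some content here with a email address that contains @mail.
>>
endobj
11 12 obj
<<
Some more content here
>>
endobj
"""

# The pattern that does not behave as intended.
pattern = re.compile(r"\d+\ \d+\ obj[\s\S]+?@mail[\s\S]+?endobj")

match = pattern.search(text)
matched = match.group(0)

# The match starts at the first block's header, not at "12 13 obj".
print(matched.splitlines()[0])

# It runs across two "endobj" markers, so two blocks are swallowed.
print(matched.count("endobj"))
```

The match begins at 12 12 obj and only ends at the endobj of the 12 13 block, so removing it would delete the first block as well.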
I have an idea as to why this is happening, but I'm not really sure how to work around it. My theory is that the lazy quantifier ends up behaving greedily here: the first block does not contain @mail, so the match keeps expanding past that block's endobj until it reaches the next block that does. I've tried various combinations of excludes such as ^(?:*****), but those don't seem to match anything when I try them.
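For reference, one possible direction is a negative lookahead that stops the lazy part from stepping over an endobj (sometimes called a "tempered" quantifier). The sketch below is only an assumption about what might work, not a confirmed fix: it assumes each header sits at the start of a line, that endobj never appears inside a block's content, and the names `text`, `block_with_mail`, and `cleaned` are made up for illustration.

```python
import re

# Sample text from the question (trailing newline assumed).
text = """12 12 obj
<<
Some content here
>>
endobj
12 13 obj
<<
Some content here with a email address that contains @mail.
>>
endobj
11 12 obj
<<
Some more content here
>>
endobj
"""

block_with_mail = re.compile(
    r"^\d+ \d+ obj"          # block header at the start of a line
    r"(?:(?!endobj)[\s\S])*?"  # any character, but never step onto "endobj"
    r"@mail"                 # the string that marks a block for removal
    r"[\s\S]*?endobj\n?",    # rest of the block, plus its line break if any
    re.MULTILINE,
)

# Remove only the blocks that contain "@mail".
cleaned = block_with_mail.sub("", text)
print(cleaned)
```

Because the lookahead refuses to consume the start of endobj, an attempt that begins at a block without @mail fails inside that block instead of spilling into the next one, and the engine then retries from a later header.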