Note that the start/end delimiters are inside lookaround constructs in your pattern, so they are not consumed and will remain in the resulting string after re.sub. You should convert the lookbehind and lookahead into consuming patterns.
Also, you seem to want to remove special chars after the right-hand delimiter, so you need to add [^\w\s]* at the end of the regex.
You may use
import re
text = """|start| this is first para to remove |end|.
this is another text.
|start| this is another para to remove |end|. Again some free text"""
print( re.sub(r'(?s)\|start\|.*?\|end\|[^\w\s]*', '', text).replace('\n', '') )
# => this is another text. Again some free text
See the Python demo.
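To see why the lookarounds leave the delimiters behind, here is a minimal sketch comparing the two approaches. The lookaround pattern is only an assumption about what the original pattern looked like (it is not quoted here), and the sample string is made up for illustration.

```python
import re

sample = "keep |start| drop me |end|. keep too"

# Assumed lookaround version: the delimiters are only asserted, never consumed,
# so re.sub removes just the text between them.
lookaround = r'(?s)(?<=\|start\|).*?(?=\|end\|)'
with_lookaround = re.sub(lookaround, '', sample)
print(repr(with_lookaround))  # 'keep |start||end|. keep too'

# Consuming version: the delimiters and the trailing punctuation are part of
# the match, so they are removed too.
consuming = r'(?s)\|start\|.*?\|end\|[^\w\s]*'
with_consuming = re.sub(consuming, '', sample)
print(repr(with_consuming))   # 'keep  keep too'
```

The consuming version leaves the spaces on both sides of the removed block, hence the double space in the second result.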
Regex details
(?s) - inline DOTALL modifier
\|start\| - |start| text
.*? - any 0+ chars, as few as possible
\|end\| - |end| text
[^\w\s]* - 0 or more chars other than word and whitespace chars.
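If you prefer to keep this breakdown next to the pattern itself, one option is to compile the same regex with re.VERBOSE and pass re.DOTALL as a flag instead of the inline (?s). This is just a sketch; the name BLOCK_RE is illustrative and not part of your code.

```python
import re

# Same pattern as above, written in verbose mode so each part carries a comment.
BLOCK_RE = re.compile(
    r"""
    \|start\|   # literal |start| delimiter
    .*?         # any chars (including newlines, due to DOTALL), as few as possible
    \|end\|     # literal |end| delimiter
    [^\w\s]*    # 0 or more chars other than word and whitespace chars
    """,
    re.DOTALL | re.VERBOSE,
)

text = """|start| this is first para to remove |end|.
this is another text.
|start| this is another para to remove |end|. Again some free text"""

result = BLOCK_RE.sub('', text).replace('\n', '')
print(result)
# => this is another text. Again some free text
```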