My data look like this:
data = """2020-11-16 18:07:12,752 part1: 3, part2: [0.43732753 0.53800666 0.5945345 0.4680181 0.39867717 0.6513964
0.6692839 0.5011144 0.41480267 0.40932187 0.40816575 0.44242284
0.56950533 0.54023486 0.46301603 0.46460354], part3: 0.3253246169133328
2020-11-16 18:07:23,940 part1: 4, part2: [0.4273718 0.5393375 0.591234 0.46008328 0.3886507 0.658916
0.7164184 0.37173408 0.42199427 0.5302575 0.34260145 0.5678605
0.5731818 0.5455015 0.45556515 0.47291118], part3: 0.37686885359458105"""
I want to extract everything after the timestamp, namely from part1 through the end of part3. For the first record, the desired output should look like:
output = """part1: 3, part2: [0.43732753 0.53800666 0.5945345 0.4680181 0.39867717 0.6513964
0.6692839 0.5011144 0.41480267 0.40932187 0.40816575 0.44242284
0.56950533 0.54023486 0.46301603 0.46460354], part3: 0.3253246169133328"""
But the line breaks inside each record make my pattern fail. My current code looks like this:
output = re.findall(r"part1:(.*)\d{4}-\d{2}", data, re.DOTALL)[0]
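To make the attempt reproducible, here is a minimal self-contained sketch. It assumes the sample is held in a triple-quoted string so the line breaks survive, and the name `attempt` is only illustrative. It prints the two ends of the captured text so it can be compared with the desired output.

```python
import re

# Sample copied from above; a triple-quoted string keeps the line breaks.
data = """2020-11-16 18:07:12,752 part1: 3, part2: [0.43732753 0.53800666 0.5945345 0.4680181 0.39867717 0.6513964
0.6692839 0.5011144 0.41480267 0.40932187 0.40816575 0.44242284
0.56950533 0.54023486 0.46301603 0.46460354], part3: 0.3253246169133328
2020-11-16 18:07:23,940 part1: 4, part2: [0.4273718 0.5393375 0.591234 0.46008328 0.3886507 0.658916
0.7164184 0.37173408 0.42199427 0.5302575 0.34260145 0.5678605
0.5731818 0.5455015 0.45556515 0.47291118], part3: 0.37686885359458105"""

# Same pattern and flag as the current code.
attempt = re.findall(r"part1:(.*)\d{4}-\d{2}", data, re.DOTALL)[0]

# The literal "part1:" sits outside the capture group, so it is not returned,
# and the greedy (.*) runs up to the last date-like text it can find.
print(repr(attempt[:12]))
print(repr(attempt[-20:]))
```

Two things stand out when comparing this with the desired output: the `part1:` label is missing because it is outside the parentheses, and the capture keeps the line break that precedes the next timestamp.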
I tried every method I could find, including all of those in this post:
How do I match any character across multiple lines in a regular expression?
Namely, I tried replacing `(.*)` with `([\s\S]*)`, `(.|\n|\r*)`, or `((?s).*)`, and combining these with the `re.DOTALL` and `re.MULTILINE` flags. None of them worked. Could anyone help me out?
Update: I tried `part1:((.*)\n)*?(.*)part3(.*)` and it worked in the VS Code find box, but it doesn't work in Python.
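For comparison, one direction that might be worth trying is a non-greedy match that starts at `part1:` and stops at the first `part3:` value. This is only a sketch under two assumptions: every record ends with `part3:` followed by a single number with no spaces in it, and the names `record`, `first`, and `all_records` are illustrative.

```python
import re

# Sample copied from above; a triple-quoted string keeps the line breaks.
data = """2020-11-16 18:07:12,752 part1: 3, part2: [0.43732753 0.53800666 0.5945345 0.4680181 0.39867717 0.6513964
0.6692839 0.5011144 0.41480267 0.40932187 0.40816575 0.44242284
0.56950533 0.54023486 0.46301603 0.46460354], part3: 0.3253246169133328
2020-11-16 18:07:23,940 part1: 4, part2: [0.4273718 0.5393375 0.591234 0.46008328 0.3886507 0.658916
0.7164184 0.37173408 0.42199427 0.5302575 0.34260145 0.5678605
0.5731818 0.5455015 0.45556515 0.47291118], part3: 0.37686885359458105"""

# re.DOTALL lets "." cross line breaks; ".*?" is non-greedy, so the match
# stops at the first "part3:" instead of running on to a later record.
# "\S+" then takes the part3 value up to the next whitespace.
record = re.compile(r"part1:.*?part3:\s*\S+", re.DOTALL)

# No capture group is used, so group(0) and findall() return whole matches,
# including the "part1:" label.
first = record.search(data).group(0)
all_records = record.findall(data)

print(first)
print(len(all_records))
```

Because the pattern does not depend on the next timestamp, it should also pick up the final record, which has no timestamp after it.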