
I am new to regex and I am struggling with some pattern matching.

I have the following string, for example. Note that february 23 appears twice (in my real data it could appear more than twice), and the same happens for march 15.

st = "This happened on february 23. In that date, february 23, a lot of things happened until march 15. In march 15, nothing new happened."

Let's say I try

import re
re.findall('february 23.* march 15',st,re.I)

Then what I find is

['february 23. In that date, february 23, a lot of things happened until march 15. In march 15']

I would like to find the shortest string. In this case it would be: "february 23, a lot of things happened until march 15"

I have read a similar question here on Stack Overflow, and someone recommended using a "non-greedy matching regex". So I tried to imitate their code, and here is what I wrote:

re.findall('february 23.*? march 15',st,re.I)

And the result was

['february 23. In that date, february 23, a lot of things happened until march 15']

Much better, since march 15 is no longer repeated, but february 23 still appears twice. How can I match the string "february 23, a lot of things happened until march 15"?

Note that I need to find the shortest string between february 23 and march 15, and the match needs to be case-insensitive. Is there a smart way to do this using regex? I would love to know how to do this with regex, but if that is not possible I am open to other ideas.

Thank you so much in advance.

Tom
  • In short, you might use `'february 23(?:(?!february 23).)*? march 15'` – Wiktor Stribiżew Apr 28 '20 at 18:55
  • Thank you very much. Do you have any book or reference for learning how this regex works? – Tom Apr 28 '20 at 19:03
  • Yes, see [my answer here](https://stackoverflow.com/questions/30900794/tempered-greedy-token-what-is-different-about-placing-the-dot-before-the-negat). – Wiktor Stribiżew Apr 28 '20 at 19:05
  • It seems I have a lot to learn before I can understand your answer. It will take time, but it will be worth it. Thank you – Tom Apr 28 '20 at 19:13
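
For anyone landing here, the pattern from the first comment can be tried out roughly as follows. This is only a sketch: the helper name `shortest_spans` is made up for illustration, and adding `re.S` (so that `.` also crosses line breaks) is an assumption that the original comment does not make. The idea is that `(?!february 23).` consumes one character at a time, but only where a new "february 23" does not begin, so a match can never contain a second start marker. An attempt that begins at the first "february 23" therefore fails, and the engine retries from the later one.

```python
import re

def shortest_spans(text, start, end):
    # Hypothetical helper (not from the thread): builds the "tempered" pattern
    # suggested in the comment for arbitrary literal start/end markers.
    start_re = re.escape(start)
    end_re = re.escape(end)
    # (?:(?!START).)*? : lazily consume characters, refusing to step over
    # any position where START begins again.
    pattern = f"{start_re}(?:(?!{start_re}).)*?{end_re}"
    # re.I gives case-insensitive matching; re.S is an added assumption so
    # that spans containing newlines are also found.
    return re.findall(pattern, text, flags=re.I | re.S)

st = ("This happened on february 23. In that date, february 23, a lot of "
      "things happened until march 15. In march 15, nothing new happened.")

print(shortest_spans(st, "february 23", "march 15"))
# ['february 23, a lot of things happened until march 15']
```

The lazy `*?` is what stops at the first "march 15"; the negative lookahead is what forces the match to begin at the last "february 23" before it.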
