I'm new to regex, and I came across a problem that I solved, although I'm not sure why it was a problem in the first place, and I'd like to learn a bit more!
I'm using Python for my regex. The relevant portion of the text to be captured looks like this (I've changed the exact numbers):
Evaluation Type: InterimContract Percent Complete: 30%Period of Performance Being Assessed: 05/27/2013 -
I'm looking to capture Interim and 05/27/2013. The regex that I was using that did NOT work was:
match = re.search(
"Evaluation Type:[\s\n]*(.*?)[\s\n]*Contract Percent[.]*"
"Period of Performance Being Assessed:[\s\n]*(.*?)[\s\n]*-"
, page_content)
The code that does work is:
match = re.search(
"Evaluation Type:[\s\n]*(.*?)[\s\n]*Contract Percent.*"
"Period of Performance Being Assessed:[\s\n]*(.*?)[\s\n]*-"
, page_content)
(As you may notice, the difference is that I removed the square brackets around the . at the end of line 2.)
I understand that the brackets weren't actually needed (they just helped me visualize the pattern while I was building it), but I'm not sure why they broke it. The first version gave no match, while the second matched perfectly. I'm sure it's some simple little thing, but I couldn't find what was breaking in my searches online (although it could be that I don't understand regex well enough to know what to look for).
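For reference, here is a minimal, self-contained reproduction. It assumes the sample line above is the whole of page_content (my real text is longer), and I've written the patterns as raw strings, which shouldn't change how they behave. The last three checks are my attempt to narrow it down: they suggest that inside square brackets the . only matches a literal dot instead of any character, which seems to line up with the note in the Python re docs that special characters lose their special meaning inside sets.

```python
import re

# Assumption: the sample line from above stands in for the real page_content.
page_content = (
    "Evaluation Type: InterimContract Percent Complete: 30%"
    "Period of Performance Being Assessed: 05/27/2013 -"
)

# Version with [.]* (the one that did NOT work for me).
broken = (
    r"Evaluation Type:[\s\n]*(.*?)[\s\n]*Contract Percent[.]*"
    r"Period of Performance Being Assessed:[\s\n]*(.*?)[\s\n]*-"
)

# Version with a bare .* (the one that works).
working = (
    r"Evaluation Type:[\s\n]*(.*?)[\s\n]*Contract Percent.*"
    r"Period of Performance Being Assessed:[\s\n]*(.*?)[\s\n]*-"
)

broken_match = re.search(broken, page_content)
working_match = re.search(working, page_content)

print(broken_match)             # None
print(working_match.groups())   # ('Interim', '05/27/2013')

# Narrowing it down: what does each form match on its own?
between = " Complete: 30%"  # the text sitting between the two labels
only_dots = re.fullmatch(r"[.]*", "...")        # matches a run of literal dots
bracketed = re.fullmatch(r"[.]*", between)      # None: not made of dots
unbracketed = re.fullmatch(r".*", between)      # matches any characters

print(only_dots is not None, bracketed is None, unbracketed is not None)
```

If that reading is right, [.]* after "Contract Percent" can only skip over dots, so it never gets past " Complete: 30%" to reach "Period of Performance Being Assessed:".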