
Can someone please explain why my compiled pattern doesn't match this string? I just get an empty list.

I tried using VERBOSE mode as well.

The first part (+45) is supposed to be optional (therefore the ?)

Then the next part can be either a dash or a space (I tried using a regular space, and that didn't work either). This is optional as well.

Then the four digits

then another separator, also optional

Then the last 4 digits.

import re
b = re.compile(r"(\+45)?(-|\s)?\d\d\d\d(-|\s)?\d\d\d\d")
b.findall("+45 2222 2222 is")

1 Answer

The main problem is that your current regex pattern has three capturing groups. re.findall behaves differently depending on the capture groups in the pattern: when there are any, it returns the text captured by the groups rather than the whole match. Since you want the match for the entire pattern, it shouldn't have any capturing groups. Two of the three groups aren't even needed, so I removed them below. For the optional country code, I made the group non-capturing by using ?:.

import re
b = re.compile(r'(?:\+45)?[ -]?\d{4}[ -]?\d{4}')
m = b.findall("+45 2222 2222 is")
print(m)

This prints:

['+45 2222 2222']
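
To make the effect of capture groups on re.findall concrete, here is a minimal sketch that runs both patterns on the same input. The variable names are only for illustration; re.finditer is shown as one alternative if you would rather keep the capturing groups and still get the whole match.

```python
import re

text = "+45 2222 2222 is"

# Original pattern: three capturing groups, so findall returns
# one tuple of group contents per match, not the full match.
with_groups = re.compile(r"(\+45)?(-|\s)?\d\d\d\d(-|\s)?\d\d\d\d")
print(with_groups.findall(text))   # [('+45', ' ', ' ')]

# No capturing groups: findall returns the full matched text.
without_groups = re.compile(r"(?:\+45)?[ -]?\d{4}[ -]?\d{4}")
print(without_groups.findall(text))   # ['+45 2222 2222']

# Alternative: keep the groups and read group(0) from each match object.
full_matches = [m.group(0) for m in with_groups.finditer(text)]
print(full_matches)   # ['+45 2222 2222']
```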

For completeness, here is an explanation of the updated version of the regex:

(?:\+45)?    match an optional leading '+45' country code
[ -]?        match either a space or a dash, zero or one time
\d{4}        match 4 digits
[ -]?        match another space or a dash, zero or one time
\d{4}        match 4 more digits

Note that [ -] is a character class, and represents one character which is either space or dash.

Tim Biegeleisen
  • But what do the brackets and colon do? I am doing the Automate the Boring Stuff course on Udemy, and he did not mention anything about them. I also looked it up on Google but nothing came up, except how to use a backslash to escape them and make a literal bracket or colon. – Mads Lildholdt Hansen Mar 08 '20 at 13:01
  • @Mads I have added a more detailed explanation of how the regex is working. – Tim Biegeleisen Mar 08 '20 at 13:07