
A few weeks ago I found a regex example that does what I need, but I can no longer find the source and I did not document my code properly. The code works as intended, but can someone break it down into sections so I can understand what it is doing, in case I want to reuse it for something else? I am reading a column of notes left by a system in a CSV file.

This one extracts the phone number, which is in the format (999) 999-9999. I have noticed that it also works when the number is 10 plain digits, 9999999999.

call['phone_number'] = call['activity'].str.extract(r'.*?(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4}).*?')
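To make the pieces easier to read, here is a rough sketch of the same pattern written with `re.VERBOSE`, one comment per token. It uses the standard `re` module rather than pandas, relying on the fact that `str.extract` returns the capture group from the first match in each string, much as `re.search` does. The helper `extract_phone` and the sample notes are made up for illustration and are not from the real CSV.

```python
import re

# The same pattern, spelled out piece by piece with re.VERBOSE.
PHONE = re.compile(r"""
    .*?        # lazily skip any leading text (as few characters as possible)
    (          # start of the capture group that extract() returns
      \(?      #   an optional opening parenthesis
      \d{3}    #   three digits (area code)
      \D{0,3}  #   zero to three non-digits, e.g. ") " or "-" or nothing
      \d{3}    #   three digits
      \D{0,3}  #   zero to three non-digits, e.g. "-" or nothing
      \d{4}    #   four digits
    )          # end of the capture group
    .*?        # lazy match of trailing text (matches nothing in practice)
""", re.VERBOSE)


def extract_phone(note):
    """Return the first phone-like substring in note, or None (hypothetical helper)."""
    match = PHONE.search(note)
    return match.group(1) if match else None


# Hypothetical sample notes.
formatted = extract_phone("Called (999) 999-9999 (Mobile)")
plain = extract_phone("Called 9999999999 today")
missing = extract_phone("No number recorded")
```

Because each `\D{0,3}` allows zero separators and `\(?` is optional, the plain 10-digit form matches as well, which would explain why both formats work.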

This one extracts the text between parentheses, such as (Mobile), (Work), or (Home). I noticed that if that text is missing for some reason, it picks up the area code of the phone number (999) instead. I then replace those incorrect values with blanks.

call['call_type'] = call['activity'].str.extract(r'\((.*?)\)')
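The area-code fallback can be reproduced with plain `re`, assuming the label normally comes before the number in the note. The pattern captures whatever sits inside the first pair of parentheses, so when the label is absent the first pair is the area code. The second pattern is only a hypothetical stricter variant that accepts letters alone; the sample strings and the helper `first_group` are invented for this sketch.

```python
import re

# Original pattern: "(" then a lazy capture of anything, up to the first ")".
CALL_TYPE = re.compile(r"\((.*?)\)")

# Hypothetical stricter variant: only letters are allowed between the parentheses.
CALL_TYPE_LETTERS = re.compile(r"\(([A-Za-z]+)\)")


def first_group(pattern, note):
    """Return group 1 of the first match of pattern in note, or None."""
    match = pattern.search(note)
    return match.group(1) if match else None


# Hypothetical sample notes.
with_label = first_group(CALL_TYPE, "(Mobile) (999) 999-9999")
without_label = first_group(CALL_TYPE, "(999) 999-9999")
strict_without_label = first_group(CALL_TYPE_LETTERS, "(999) 999-9999")
strict_after_number = first_group(CALL_TYPE_LETTERS, "(999) 999-9999 (Work)")
```

With a letters-only variant like this, rows without a label would come back empty instead of holding the area code, so the later cleanup step might not be needed. Whether that suits the real data depends on what the labels actually contain.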
trench
  • 1. Go to [regex101.com](http://regex101.com). 2. Paste the regex into the regex field. 3. Switch to Python on the left. 4. See the Explanation pane on the right. The points of interest for you here are certainly the quantifiers `*?` and `{3}`/`{0,3}`. – Wiktor Stribiżew Jan 07 '16 at 11:53
  • Well, that worked quickly and easily. I guess I do not need to post any regex explanation questions moving forward. Thanks again. – trench Jan 07 '16 at 12:02
  • Well, it depends. If you have a rather complex pattern that gives you unexpected results, and you have tried something but do not know how to fix it, you can post such questions freely. A plain explanation can be obtained from online regex testers like regex101.com, Expresso (for .NET), [debuggex](https://www.debuggex.com/) (visual diagram), etc. – Wiktor Stribiżew Jan 07 '16 at 12:11
