I found a nice example of a regex that worked for my needs, but a couple of weeks later I can no longer find the source, and I forgot to document my code properly. The code works as intended, but can someone break it down into sections so I can better understand what it is doing, in case I want to reuse it for something else? I am reading a column of notes left by a system in a CSV file.
This one extracts the phone number, which is in the format (999) 999-9999. I have noticed that it also works when the number is written as 10 consecutive digits, 9999999999.
call['phone_number'] = call['activity'].str.extract(r'.*?(\(?\d{3}\D{0,3}\d{3}\D{0,3}\d{4}).*?')
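For reference, here is a minimal sketch of how the phone pattern can be read piece by piece, using Python's `re` module in verbose mode so each part carries a comment. The sample notes are made up (they are not from the real CSV) and `extract_phone` is a hypothetical helper; the comments are one reading of the pattern, not the original author's explanation.

```python
import re

# Same pattern as in the question, split across lines with re.VERBOSE
# so that each piece can be commented. Whitespace in the pattern is ignored.
PHONE = re.compile(r'''
    .*?          # lazily skip any leading characters
    (            # start of capture group 1 (the part that is returned)
      \(?        # an optional opening parenthesis
      \d{3}      # three digits (area code)
      \D{0,3}    # zero to three non-digit characters, such as ") " or "-"
      \d{3}      # three digits
      \D{0,3}    # zero to three non-digit characters
      \d{4}      # four digits
    )            # end of capture group 1
    .*?          # lazy match at the end; it matches nothing here
''', re.VERBOSE)

# Made-up sample notes (assumption: not taken from the real data).
samples = [
    "Called (555) 123-4567 (Mobile) - left voicemail",
    "Called 5551234567 no answer",
    "No number recorded",
]

def extract_phone(note):
    """Return the first phone-like substring, or None if there is none."""
    m = PHONE.search(note)
    return m.group(1) if m else None

phones = [extract_phone(s) for s in samples]
print(phones)  # ['(555) 123-4567', '5551234567', None]
```

Because every `\D{0,3}` may match zero characters, the same pattern accepts both the punctuated and the plain 10-digit forms, which would explain the behavior described above.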
This one extracts the text between parentheses, such as (Mobile), (Work), or (Home). I noticed that if, for some reason, there is no such text, it picks up the area code of the phone number (999) instead. I then replace those incorrect values with blanks.
call['call_type'] = call['activity'].str.extract(r'\((.*?)\)')
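A small sketch of why the area code shows up, again with the plain `re` module: the pattern captures whatever sits inside the first pair of parentheses, whether it is a label or digits. The sample notes are made up and assume the label appears before the phone number; `extract_call_type` is a hypothetical helper that also blanks out digit-only captures, mirroring the cleanup step described above.

```python
import re

# \(     a literal opening parenthesis
# (.*?)  capture group 1: as few characters as possible
# \)     a literal closing parenthesis
CALL_TYPE = re.compile(r'\((.*?)\)')

# Made-up sample notes (assumption: the label comes before the number).
samples = [
    "Call to (Mobile) (555) 123-4567",
    "Call to (Work) 5551234567",
    "Call to (555) 123-4567",   # no label: first parentheses hold the area code
    "Call to 5551234567",       # no parentheses at all
]

def extract_call_type(note):
    """Return the first parenthesized text, blanking digit-only captures."""
    m = CALL_TYPE.search(note)
    if m is None:
        return None
    label = m.group(1)
    # A capture made only of digits is an area code, not a call type.
    return "" if label.isdigit() else label

raw = [m.group(1) if (m := CALL_TYPE.search(s)) else None for s in samples]
cleaned = [extract_call_type(s) for s in samples]
print(raw)      # ['Mobile', 'Work', '555', None]
print(cleaned)  # ['Mobile', 'Work', '', None]
```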