I am using regex to parse ZIP codes out of address strings stored in a pandas column. The ZIP codes are 5 digits; however, some building/unit numbers are also 5 digits, so I'd like the last match in each string rather than the first.
Here's my code:
import re

# Function to search for a ZIP code in an address
def zipregex(address):
    zipre = re.search(r'(\d{5})([- ])?(\d{4})?', address)
    if zipre:
        print(address, zipre.groups())

# Function call
df['Zip'] = df.apply(lambda x: zipregex(x['Address']), axis=1)
This prints:
642b N 17th Ave, Phoenix, AZ 85007, USA ('85007', None, None)
38956-38962 N New River Rd, Peoria, AZ 85383, USA ('38956', '-', '3896')
In the second case, I need it to return 85383, not 38956-38962.
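For reference, here is a minimal sketch of the behaviour I'm after: collect every match with `re.findall` and keep the last one. The helper name `last_zip` is made up for illustration, and the `\b` word boundaries are my own assumption (added so that 5 digits inside a longer number are not picked up); I have not checked it against the full dataset.

```python
import re

# Same idea as the pattern above, with word boundaries added (an assumption)
# and the optional ZIP+4 part kept as a second capture group.
ZIP_RE = re.compile(r'\b(\d{5})(?:[- ](\d{4}))?\b')

def last_zip(address):
    """Return the 5-digit part of the last match in address, or None."""
    # findall returns one (zip5, plus4) tuple per match, in left-to-right order
    matches = ZIP_RE.findall(address)
    return matches[-1][0] if matches else None
```

Unlike my original function, this returns the value instead of printing it, so it could presumably be applied with something like `df['Zip'] = df['Address'].apply(last_zip)`. It still assumes the ZIP code is the last 5-digit number in the string, which would fail if anything 5 digits long follows it.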