If you do not have to distinguish between INS
, RES
and CR
numbers, you can use this code:
import re
data = [ 'Ticket number INS 00909',
'Ticket number INS00909',
'Ticket number REQ 8776',
'Ticket number REQ#8777',
'Ticket number REQ #8778',
'Ticket number REQ8778',
'Number is CR 0098445554',
'No INS number',
'No REQUEST',
'DESCRIBED']
numbers = []
for item in data:
res = re.findall(r'[0-9]+$', item)
if res != []:
numbers.append(res[0])
print(numbers)
which gives a simple list with all the ticket numbers together:
['00909', '00909', '8776', '8777', '8778', '8778', '0098445554']
Otherwise, if you need to distinguish the ticket types, use this:
import re
data = [ 'Ticket number INS 00909',
'Ticket number INS00909',
'Ticket number REQ 8776',
'Ticket number REQ#8777',
'Ticket number REQ #8778',
'Ticket number REQ8778',
'Number is CR 0098445554',
'No INS number',
'No REQUEST',
'DESCRIBED']
numbers = {'INS': [], 'REQ': [], 'CR': []}
for item in data:
res = re.findall(r'[0-9]+$', item)
if res != []:
if 'INS' in item:
numbers['INS'].append(res[0])
elif 'REQ' in item:
numbers['REQ'].append(res[0])
elif 'CR' in item:
numbers['CR'].append(res[0])
print(numbers)
which gives a dictionary with a key for each ticket type:
{'INS': ['00909', '00909'], 'REQ': ['8776', '8777', '8778', '8778'], 'CR': ['0098445554']}
I kept the ticket numbers as str
and I had not converted them into int
in order to avoid this, perhaps unwanted, conversion: '00909' ---> 909
.
This solution is based on the assumption that the ticket number is always at the end of the string.