I'm looking to extract only the 7-digit numbers from the items in this list that start with distanceMatrix_, and nothing after the underscore that follows the number.

The list:

['data_train_3366094.dump','agile_234444.pkl','distanceMatrix_1517144.dump', 'distanceMatrix_3366094_1.dump']

Expected output: 1517144, 3366094

Arjun
  • `(?<=distanceMatrix_)\d{7}\b`? Demo: https://regex101.com/r/jgySfR/1 – 41686d6564 Sep 14 '20 at 23:11
  • Half solution:
        import re
        string = ['data_train_3366094.dump','agile_234444.pkl','distanceMatrix_1517144.dump', 'distanceMatrix_3366094_1.dump']
        # search using regex
        for i in string:
            result = [i for i in string if i.startswith('distanceMatrix_')]
        # x = re.findall('[0-9]+', result)
        print(result)
    – Arjun Sep 14 '20 at 23:13
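
One way the half solution above could be finished is sketched here; it is not the asker's final code. It anchors the pattern at the start of each name and captures exactly seven digits. The names `names`, `pattern` and `ids` are made up for illustration. It uses `(?!\d)` rather than `\b` because `_` counts as a word character, so `\b` would not match between `3366094` and `_1`.

```python
import re

names = ['data_train_3366094.dump', 'agile_234444.pkl',
         'distanceMatrix_1517144.dump', 'distanceMatrix_3366094_1.dump']

# Anchor at the start of the name, capture exactly 7 digits,
# and require that the next character is not another digit.
pattern = re.compile(r'^distanceMatrix_(\d{7})(?!\d)')

ids = []
for name in names:
    m = pattern.match(name)
    if m:
        ids.append(m.group(1))

print(ids)
```

The results are strings; wrap `m.group(1)` in `int()` if numeric values are needed for the analysis.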

1 Answer

My guess is to split each string on _ and . (explode() style), then match the piece that is a numeric value.

  • Unfortunately, we cannot do that as we are expecting to extract the numbers starting after distanceMatrix_ for analysis – Arjun Sep 14 '20 at 23:30
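
The split idea can still respect the distanceMatrix_ requirement if the prefix is checked first. The following is a minimal sketch under that assumption; the helper name `matrix_id` and its `prefix` parameter are hypothetical, not from the thread.

```python
import re

def matrix_id(name, prefix="distanceMatrix_"):
    """Return the 7-digit ID that follows the prefix, or None."""
    if not name.startswith(prefix):
        return None
    # Drop the prefix, then split the rest on "_" or "." and keep the first piece.
    first = re.split(r"[_.]", name[len(prefix):])[0]
    # Accept only pieces made of exactly 7 digits.
    if len(first) == 7 and first.isdigit():
        return first
    return None

names = ['data_train_3366094.dump', 'agile_234444.pkl',
         'distanceMatrix_1517144.dump', 'distanceMatrix_3366094_1.dump']

found = [matrix_id(n) for n in names if matrix_id(n) is not None]
print(found)
```

Checking the prefix before splitting is what keeps `data_train_3366094.dump` out of the result, even though it contains a 7-digit number.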