I would like to extract three columns from a line of text using a regular expression (RegEx) in Python. The first and second columns are always present in the input, but the third column is optional. The values of the first and third columns must come from List1 and List3, respectively.
I can solve the problem with if-else blocks, but I suspect that is not very efficient. Any thoughts on how this could be expressed as a single RegEx pattern? Thanks.
What I have implemented so far looks like this:
List1 = ['temp1', 'another', 'etc']
List3 = ['tempX', 'anotherY', 'Z', 'Y']
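line = line.strip()  # str.strip() returns a new string, so keep the result
# Candidates from List1 that the line starts with
list_of_list1 = list(filter(line.startswith, List1))
if len(list_of_list1) > 0:
    sample1 = list_of_list1[0]
    # Remainder of the line after the first column
    t = line[len(list_of_list1[0]):].strip()
    # Candidates from List3 that the remainder ends with
    list_of_list3 = list(filter(t.endswith, List3))
    if len(list_of_list3) > 0:
        ...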
Example inputs (line) and desired extractions (sample1, sample2, sample3):
"temp1 testsentencetempX "
sample1 = "temp1"
sample2 = "testsentence"
sample3 = "tempX"
" temp1 test Z "
sample1 = "temp1"
sample2 = "test"
sample3 = "Z"
"temp1 testyy"
sample1 = "temp1"
sample2 = "testyy"
sample3 = None
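
For reference, one way the behaviour in the examples above might be written as a single pattern is sketched below. It is only a sketch under some assumptions: the helper names `alternation`, `build_pattern` and `extract` are made up for illustration, matching is case-sensitive (so "testyy" does not end with "Y"), whitespace between columns is optional (as in "testsentencetempX"), and a line that does not start with a List1 value yields None.

```python
import re

List1 = ['temp1', 'another', 'etc']
List3 = ['tempX', 'anotherY', 'Z', 'Y']


def alternation(words):
    # Hypothetical helper: escape each word and join with "|".
    # Longest words come first so a longer keyword is tried before
    # a shorter one that happens to be its prefix.
    ordered = sorted(words, key=len, reverse=True)
    return '|'.join(re.escape(w) for w in ordered)


def build_pattern(firsts, thirds):
    # Group 1: mandatory first column (one of `firsts`)
    # Group 2: mandatory second column, matched lazily so that it
    #          stops as soon as the rest of the pattern can match
    # Group 3: optional third column (one of `thirds`) at the end
    return re.compile(
        rf'\s*({alternation(firsts)})\s*(.+?)\s*({alternation(thirds)})?\s*$'
    )


PATTERN = build_pattern(List1, List3)


def extract(line):
    # Returns (sample1, sample2, sample3), or None if the line does
    # not start with a List1 value (an assumption of this sketch).
    m = PATTERN.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


for example in ["temp1 testsentencetempX ", " temp1 test Z ", "temp1 testyy"]:
    print(repr(example), '->', extract(example))
```

The lazy `(.+?)` is what lets the optional third group claim a List3 suffix when one is present; when none is present, group 3 is simply None. Whether this is faster than the if-else version would need to be measured on real data.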