
I would like to extract three columns from a line of text using a regular expression (regex) in Python. The first and second columns are always present in the input, but the third column is optional. The values of the first and third columns must come from List1 and List3, respectively.

I can solve the problem with if-else blocks, but I suspect that is not very efficient. Is there a way to do this with a single regex pattern? Thanks.

What I have implemented so far looks like this:

List1 = ['temp1', 'another', 'etc']
List3 = ['tempX', 'anotherY', 'Z', 'Y']

line = line.strip()  # str.strip() returns a new string, so the result must be assigned
list_of_list1 = list(filter(line.startswith, List1))
if len(list_of_list1) > 0:
    sample1 = list_of_list1[0]
    t = line[len(sample1):].strip()  # remainder after the first column
    list_of_list3 = list(filter(t.endswith, List3))
    if len(list_of_list3) > 0:
        ...

Example inputs (line) and desired extractions (sample1, sample2, sample3):

    "temp1 testsentencetempX  "
        sample1 = "temp1"
        sample2 = "testsentence"
        sample3 = "tempX"
    "  temp1 test Z  "
        sample1 = "temp1"
        sample2 = "test"
        sample3 = "Z"           
    "temp1 testyy"
        sample1 = "temp1"
        sample2 = "testyy"
        sample3 = None
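One possible way to express the examples above as a single pattern is sketched below. It is only a sketch under some assumptions: the names `build_pattern`, `PATTERN` and `split_line` are made up for illustration, the keywords are matched case-sensitively, and the second column is assumed to be non-empty. The idea is to join each list into an alternation, match the second column lazily, and make the third group optional.

```python
import re

List1 = ['temp1', 'another', 'etc']
List3 = ['tempX', 'anotherY', 'Z', 'Y']


def build_pattern(prefixes, suffixes):
    """Compile one regex from the two keyword lists (hypothetical helper)."""
    # Longest keywords first, so that a keyword is not shadowed by a
    # shorter keyword that happens to be its prefix.
    first = "|".join(re.escape(p) for p in sorted(prefixes, key=len, reverse=True))
    third = "|".join(re.escape(s) for s in sorted(suffixes, key=len, reverse=True))
    # Group 1: mandatory keyword from the first list.
    # Group 2: lazy match, so it gives up a trailing keyword to group 3.
    # Group 3: optional keyword from the second list, anchored at the end.
    return re.compile(rf"\s*({first})\s*(.+?)\s*({third})?\s*$")


PATTERN = build_pattern(List1, List3)


def split_line(line):
    """Return (sample1, sample2, sample3), or None if the line does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    # group(3) is None when the optional third column is absent.
    return m.group(1), m.group(2), m.group(3)


for line in ["temp1 testsentencetempX  ", "  temp1 test Z  ", "temp1 testyy"]:
    print(repr(line), "->", split_line(line))
```

Because the second group is lazy and the third group is anchored to the end of the line, the third column is only captured when the line really ends with one of the List3 keywords; otherwise everything after the first column lands in the second group.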
ozturkib
  • @MonkeyZeus Thanks for the comment. I can handle it with looping; that is not an issue. What I am really asking is whether it can also be handled with a regex pattern, without any looping. – ozturkib Oct 28 '20 at 12:58
  • I do not think this question has been answered at all. I need your help as a community. The link given is a general regex reference page. By that logic, no question about regex should be asked on Stack Overflow. – ozturkib Oct 29 '20 at 07:55

0 Answers