I've been trying to find a way to solve this, but it seems I'm poor at regex ;)

I need to remove the `WindowsPath(...)` wrappers and the surrounding double quotes from the strings in a list:

    x = ["(WindowsPath('D:/test/1_birds_bp.png'),WindowsPath('D:/test/1_eagle_mp.png'))", "(WindowsPath('D:/test/2_reptiles_bp.png'),WindowsPath('D:/test/2_crocodile_mp.png'))"]

So I tried:

    import re
    cleaned_x = [re.sub(r"(?<=WindowsPath\(').*?(?='\))", '', a) for a in x]

which outputs:

    ["(WindowsPath(''),WindowsPath(''))", "(WindowsPath(''),WindowsPath(''))"]

What I need is:

    cleaned_x = [('D:/test/1_birds_bp.png', 'D:/test/1_eagle_mp.png'), ('D:/test/2_reptiles_bp.png', 'D:/test/2_crocodile_mp.png')]

Basically, tuples in a list.

Alexander
  • Why use `re` when every string starts with `(WindowsPath(...))`? Wouldn't it be easier to use slicing? – Ch3steR Feb 21 '20 at 05:34
  • Just curious: is there a need to use regexps if you just want to remove them (they are fixed substrings you want removed, right)? Couldn't you use the standard string `replace()` function? – Richard Feb 21 '20 at 05:37
  • @Richard anything that gives the expected result would work for me;) – Alexander Feb 21 '20 at 05:38
  • ...actually I looked again, I think you do need to use regexp. ...hang on a sec :D – Richard Feb 21 '20 at 05:39
  • @Richard You guys are excellent help! Thanks for your time and efforts! This is a wonderful community! – Alexander Feb 21 '20 at 05:44

3 Answers

You can accomplish this by using re.findall like this:

>>> cleaned_x = [tuple(re.findall(r"[A-Z]:/[^']+", a)) for a in x]
>>> cleaned_x
[('D:/test/1_birds_bp.png', 'D:/test/1_eagle_mp.png'), ('D:/test/2_reptiles_bp.png', 'D:/test/2_crocodile_mp.png')]

Hope it helps.
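
The pattern above keys on a drive letter followed by `:/`. As a sketch of an alternative, assuming every path is wrapped exactly as `WindowsPath('...')` and contains no single quote, a single capture group can pull out whatever sits between the quotes. When a pattern has exactly one group, `re.findall` returns only the group contents. The name `path_in_call` is made up for this example.

```python
import re

x = [
    "(WindowsPath('D:/test/1_birds_bp.png'),WindowsPath('D:/test/1_eagle_mp.png'))",
    "(WindowsPath('D:/test/2_reptiles_bp.png'),WindowsPath('D:/test/2_crocodile_mp.png'))",
]

# Hypothetical helper pattern: capture the text between the quotes
# of each WindowsPath('...') call, whatever the path looks like.
path_in_call = re.compile(r"WindowsPath\('([^']*)'\)")

# findall returns the captured paths in order; tuple() gives the requested shape.
cleaned_x = [tuple(path_in_call.findall(a)) for a in x]
print(cleaned_x)
```

This version would also pick up relative or UNC paths, which the drive-letter pattern would skip.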

dcg

Perhaps you could use capturing groups? For instance:

import re

re_winpath = re.compile(r'^\(WindowsPath\(\'(.*)\'\)\,WindowsPath\(\'(.*)\'\)\)$')

def extract_pair(s):
    m = re_winpath.match(s)
    if m is None:
        raise ValueError(f"cannot extract pair from string: {s}")
    return m.groups()

pairs = list(map(extract_pair, x))
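
To see the capturing-group approach end to end, here is a self-contained sketch with the sample list from the question. It is a variant, not the exact pattern above: it assumes paths never contain a single quote, so `[^']*` replaces the greedy `.*`, and it uses `fullmatch` in place of the `^...$` anchors.

```python
import re

# Variant pattern (an assumption of this sketch): [^']* cannot run past a
# closing quote, so each group holds exactly one path.
re_winpath = re.compile(r"\(WindowsPath\('([^']*)'\),WindowsPath\('([^']*)'\)\)")

def extract_pair(s):
    # fullmatch requires the whole string to fit the pattern
    m = re_winpath.fullmatch(s)
    if m is None:
        raise ValueError(f"cannot extract pair from string: {s}")
    return m.groups()  # a tuple of the two captured paths

x = [
    "(WindowsPath('D:/test/1_birds_bp.png'),WindowsPath('D:/test/1_eagle_mp.png'))",
    "(WindowsPath('D:/test/2_reptiles_bp.png'),WindowsPath('D:/test/2_crocodile_mp.png'))",
]

pairs = [extract_pair(s) for s in x]
print(pairs)
```

Because `groups()` already returns a tuple, `pairs` has the list-of-tuples shape the question asks for, and any string that does not fit the pattern raises `ValueError` rather than passing through silently.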
Kris

Here's my take.

It's not pretty, and I did it in two steps to avoid regexp spaghetti. You could turn it into a list comprehension if you like, but it should work:

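import re

for a in x:
    # Drop each "WindowsPath", plus the opening parenthesis before the first one
    a = re.sub(r'\(?WindowsPath', '', a)
    # Drop the final closing parenthesis
    a = re.sub(r'\)$', '', a)
    print(a)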
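
The loop above prints cleaned strings but does not build tuples. One possible list-comprehension version, as a rough sketch: once the `WindowsPath` names are stripped, what remains is a valid Python tuple literal, so `ast.literal_eval` from the standard library can parse it. This assumes the text "WindowsPath" never appears inside a path and that paths contain no quotes or backslashes.

```python
import ast

x = [
    "(WindowsPath('D:/test/1_birds_bp.png'),WindowsPath('D:/test/1_eagle_mp.png'))",
    "(WindowsPath('D:/test/2_reptiles_bp.png'),WindowsPath('D:/test/2_crocodile_mp.png'))",
]

# Assumption: "WindowsPath" only occurs as the wrapper name.
# After the replace, each string looks like "(('a'),('b'))", which
# literal_eval parses as a tuple of two strings.
cleaned_x = [ast.literal_eval(a.replace("WindowsPath", "")) for a in x]
print(cleaned_x)
```

`ast.literal_eval` only accepts literals (strings, numbers, tuples, and so on), so unlike `eval` it will not execute arbitrary code found in the input.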
Richard