I'm attempting to insert a numeral between 2 regex groups; however, I can't figure out how to avoid referring to a different group number.

I'm attempting to use regex to update filenames in a directory. Essentially I have a season of a TV show, and all the filenames should follow the pattern "Show - S##E## - Episode Title".

I've written a simple loop to iterate over the files and set up the naming, but the issue I'm running into is that the episode number isn't set up as 2 digits in every file. I've included the loop I tried to use to fix this problem below.

I've tried to use re.sub() to identify the S##E as group 1, and the following digits as group 2, and then insert a '0' between the two groups, but I end up referencing group 10, which isn't defined. I'm not sure how to escape the group reference without referring to group 0 or inserting a backslash.

import os
import re

files = [f for f in os.listdir(os.path.abspath(os.curdir)) if os.path.isfile(f)]
for file in files:
    os.rename(file, re.sub(r'(S\d+E)(\d\s)',r'\10\2',file))

OR

files = [f for f in os.listdir(os.path.abspath(os.curdir)) if os.path.isfile(f)]
for file in files:
    os.rename(file, re.sub(r'(S\d+E)(\d\s)',r'\1'+'0'+r'\2', file))

The intended result is for all files to follow the S##E## pattern, even for episode numbers lower than 10. The first version raises an error because it refers to a group that doesn't exist. The second does not appear to alter the filenames at all.

Dan
  • Can you provide a (simplified) expected input and output? – Stephen Cowley Dec 24 '18 at 20:04
  • I am iterating over an array of files. Each file begins with "Episode #". I'm trying to convert that to "Show - S##E##" and ensure that I use 2 digits to identify the season and episode in that pattern. – Dan Dec 24 '18 at 20:10

1 Answer

There's a note about this in the docs for re.sub:

\g<number> uses the corresponding group number; \g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0. \20 would be interpreted as a reference to group 20, not a reference to group 2 followed by the literal character '0'.
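
The ambiguity can be reproduced on a single string without touching any files. This is only a small sketch: the filename is made up for illustration, and the pattern is the one from the question.

```python
import re

pattern = r'(S\d+E)(\d\s)'
name = 'Show - S01E3 - Pilot.mkv'  # hypothetical filename, for illustration only

# r'\10\2' is read as "group 10, then group 2"; the pattern only has two groups,
# so building the replacement fails.
try:
    re.sub(pattern, r'\10\2', name)
    ambiguous_error = None
except re.error as exc:
    ambiguous_error = str(exc)

# r'\g<1>0\g<2>' is read as "group 1, a literal '0', then group 2".
fixed = re.sub(pattern, r'\g<1>0\g<2>', name)

print(ambiguous_error)
print(fixed)
```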

So, write the group reference out in the more verbose way, so it's unambiguous:

os.rename(file, re.sub(r'(S\d+E)(\d\s)',r'\g<1>0\g<2>',file))
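
If you want to check the result before renaming anything, the substitution can be pulled into a small helper and run over a list of names first. The helper name `pad_episode` and the sample filenames are hypothetical; the pattern and replacement are the same as in the line above.

```python
import re

# "S<digits>E", then a single digit followed by whitespace (pattern from the question).
EPISODE = re.compile(r'(S\d+E)(\d\s)')

def pad_episode(name):
    """Return name with a one-digit episode number zero-padded to two digits."""
    return EPISODE.sub(r'\g<1>0\g<2>', name)

# Hypothetical filenames, for illustration only.
samples = [
    'Show - S01E3 - Pilot.mkv',
    'Show - S01E12 - Finale.mkv',
]
renamed = [pad_episode(n) for n in samples]
for old, new in zip(samples, renamed):
    print(old, '->', new)
```

Names that already have a two-digit episode number do not match the pattern and come back unchanged. In the loop from the question, the call would then be os.rename(file, pad_episode(file)).
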
Blorgbeard
  • I read this page earlier today... Sorry for asking a question I should have known the answer to. Thank you for your help! – Dan Dec 24 '18 at 20:21
  • Happens to everyone. I actually went diving into the Python source before I thought to check the docs myself :P – Blorgbeard Dec 24 '18 at 20:22