p = r"\home\gef\Documents\abc_this_word_dfg.gz.tar"  # raw string: otherwise "\a" is read as an escape sequence

I'm looking for a way to retrieve this_word.

import os

base = os.path.basename(p)
base1 = base.replace("abc_", "")
base1 = base1.replace("_dfg.gz.tar", "")  # str.replace returns a new string, so keep the result

This works, but it's not ideal because I would need to know in advance which strings to remove. Maybe a regex would be appropriate here?

HappyPy

2 Answers


You don't give much information, but from what is shown can't you just use string slicing?

Maybe like this:

>>> import os
>>> p = os.path.join('home', 'gef', 'Documents', 'abc_this_word_dfg.gz.tar')
>>> p
'home/gef/Documents/abc_this_word_dfg.gz.tar'
>>> os.path.dirname(p)
'home/gef/Documents'
>>> os.path.basename(p)
'abc_this_word_dfg.gz.tar'
>>> os.path.basename(p)[4:-11]
'this_word'
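
The hard-coded 4 and -11 in the slice are just the lengths of 'abc_' and '_dfg.gz.tar'. A minimal sketch of the same slicing with the lengths computed instead of counted by hand, assuming the prefix and suffix are known in advance (the function name strip_known_affixes is made up for illustration):

```python
import os


def strip_known_affixes(path, prefix, suffix):
    """Return the base name of `path` with `prefix` and `suffix` sliced off."""
    base = os.path.basename(path)
    # Refuse to slice blindly if the name does not have the expected shape.
    if not (base.startswith(prefix) and base.endswith(suffix)):
        raise ValueError(f"{base!r} does not match {prefix!r}...{suffix!r}")
    # Same idea as base[4:-11], but the offsets come from the affixes themselves.
    return base[len(prefix):len(base) - len(suffix)]


p = "home/gef/Documents/abc_this_word_dfg.gz.tar"
print(strip_known_affixes(p, "abc_", "_dfg.gz.tar"))  # this_word
```

This still requires knowing the prefix and suffix, so it only removes the hand-counting, not the limitation raised in the question.
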
Ralf

You don't give much information, but from what is shown can't you just split on _ chars?

Maybe like this:

>>> import os
>>> p = os.path.join('home', 'gef', 'Documents', 'abc_this_word_dfg.gz.tar')
>>> p
'home/gef/Documents/abc_this_word_dfg.gz.tar'
>>> os.path.dirname(p)
'home/gef/Documents'
>>> os.path.basename(p)
'abc_this_word_dfg.gz.tar'
>>> '_'.join(
...     os.path.basename(p).split('_')[1:-1])
'this_word'

This splits the base name on underscores, discards the first and last parts, and joins the remaining parts back together with underscores. If this_word contained no underscores, only one part would be left and nothing would need joining.
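
The same steps can be wrapped in a small function. Since the question asks about a regex, a roughly equivalent pattern is sketched next to it; that pattern is an addition for comparison and not part of the approach above. Both function names are hypothetical, and both versions assume the unwanted prefix and suffix contain no underscores themselves:

```python
import os
import re


def middle_by_split(path):
    """Drop the first and last underscore-separated parts of the base name."""
    parts = os.path.basename(path).split("_")
    return "_".join(parts[1:-1])


# Text up to the first underscore, the wanted middle, text after the last underscore.
_MIDDLE = re.compile(r"[^_]*_(.+)_[^_]*")


def middle_by_regex(path):
    """Regex variant; returns None when the base name has fewer than two underscores."""
    match = _MIDDLE.fullmatch(os.path.basename(path))
    return match.group(1) if match else None


p = "home/gef/Documents/abc_this_word_dfg.gz.tar"
print(middle_by_split(p))  # this_word
print(middle_by_regex(p))  # this_word
```

The two differ on names with fewer than two underscores: the split version returns an empty string, while the regex version returns None.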

Ralf