Modify string with `re.sub`

Question

Suppose a string:

s = 'F3·Compute·Introduction to Methematical Thinking.pdf'

I tried to replace F3·Compute· with '' using a regex:

In [23]: re.sub(r'F3?Compute?', '',s)
Out[23]: 'F3·Compute·Introduction to Methematical Thinking.pdf'

It did not work as I intended.

When I used the literal separator characters in the pattern, it worked:

In [21]: re.sub(r'F3·Compute·', '', 'F3·Compute·Introduction to Methematical Thinking.pdf')
Out[21]: 'Introduction to Methematical Thinking.pdf'

What's the problem with my regex pattern?

what's your desired output? just getting rid of F3.Compute.? — skrubber, Jan 02 '18 at 05:30
Use the dot, not the question mark. The dot means 'any char', the question mark means '1 or 0 of whatever char before the question mark' — Hai Vu, Jan 02 '18 at 05:32

Tim Biegeleisen · Answer 1 · 2018-01-02T05:40:57.330

Use dot to match any single character:

#coding: utf-8
import re

s = 'F3·Compute·Introduction to Methematical Thinking.pdf'
output = re.sub(r'F3.Compute.', '', unicode(s,"utf-8"), flags=re.U)
print output

Your original pattern, 'F3?Compute?', did not have the desired effect. It matches F followed by an optional 3, then Comput followed by an optional e. In any case, it does not account for the separator characters.
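To make this concrete, here is a small sketch (written for Python 3, which is an assumption; the `candidate` strings are made up for illustration) showing what 'F3?Compute?' accepts and why it never matches the original string:

```python
import re

pattern = re.compile(r'F3?Compute?')

# '?' makes the preceding character optional, so all of these match in full
for candidate in ['F3Compute', 'FCompute', 'F3Comput', 'FComput']:
    assert pattern.fullmatch(candidate) is not None

s = 'F3·Compute·Introduction to Methematical Thinking.pdf'

# Nothing in the pattern can consume the separator after 'F3',
# so there is no match and sub() returns the string unchanged
no_match = pattern.search(s) is None
result = pattern.sub('', s)

print(no_match)     # True
print(result == s)  # True
```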

Note also that in Python 2 we must match against the unicode version of the string, not the byte string directly. Otherwise a single dot will not match the separator you are targeting, because that character is encoded as more than one byte in UTF-8. Have a look at the demo below for more information.

Demo
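The bytes-versus-text point can also be reproduced in Python 3, where the distinction is explicit. This is only a sketch: encoding to UTF-8 stands in for what a Python 2 byte string would hold, and the variable names are illustrative.

```python
import re

s = 'F3·Compute·Introduction to Methematical Thinking.pdf'
raw = s.encode('utf-8')  # byte string, similar to a Python 2 str

# The middle dot (U+00B7) takes two bytes in UTF-8
middle_dot_bytes = '\u00b7'.encode('utf-8')

# On bytes, '.' matches exactly one byte, so it cannot span the separator
bytes_result = re.sub(rb'F3.Compute.', b'', raw)

# On text, the separator is a single character and '.' matches it
text_result = re.sub(r'F3.Compute.', '', s)

print(len(middle_dot_bytes))  # 2
print(bytes_result == raw)    # True
print(text_result)            # Introduction to Methematical Thinking.pdf
```

In Python 3, string literals are already text, so no `unicode(...)` conversion is needed there.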

Downvoter: Care to leave a comment? When I answered this, the question was in good standing. — Tim Biegeleisen, Jan 02 '18 at 10:23

score -1 · Answer 2 · answered Jan 02 '18 at 05:32

The question mark ? does not stand in for a single character in regular expressions. It means 0 or 1 of the previous character, which in your case were 3 and e. Instead, the . is what you're looking for. It is a wildcard that stands for a single character (and has nothing to do with your middle-dot character; that is just a coincidence).

re.sub(r'F3.Compute.', '', s)
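Because the dot is a wildcard, this pattern also accepts separators other than the middle dot. If only that exact prefix should be removed, one possible alternative (a sketch in Python 3, not something the answers above propose; the second file name is made up) is to escape the literal prefix with `re.escape` and anchor it at the start:

```python
import re

s = 'F3·Compute·Introduction to Methematical Thinking.pdf'
other = 'F3-Compute-notes.txt'  # hypothetical name with a different separator

# The wildcard version strips the prefix from both strings
loose = re.sub(r'F3.Compute.', '', other)

# re.escape() builds a literal pattern; '^' anchors it at the start
prefix = 'F3·Compute·'
strict_pattern = '^' + re.escape(prefix)
strict = re.sub(strict_pattern, '', s)
unchanged = re.sub(strict_pattern, '', other)

print(loose)      # notes.txt
print(strict)     # Introduction to Methematical Thinking.pdf
print(unchanged)  # F3-Compute-notes.txt
```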
