
I have a configuration file where users can provide regular expressions to match against words, e.g.

wordlist = ["is", r"\b(and)\b"]

The problem is that if a user provides "is", it also matches inside "This", which is not what I want. The second regex is better because it uses word boundaries, but writing that out for every word is tedious.

My idea is the following: let the user specify raw strings (which are used untouched as regexes) and "normal" strings (which are first converted to r"\b({})\b".format(word) for convenience). Is there a way to implement this? Can reflection be used to tell if a string was initially provided as a raw string?

Dimitris Fasarakis Hilliard
duesee

1 Answer


> Can reflection be used to tell if a string was initially provided as a raw string?

Unfortunately, no. By runtime the raw-string literal has already been turned into an ordinary str object; the r prefix only tells the parser not to process backslash escapes inside the literal. There is no "raw string type" in Python, so the distinction exists only at the syntactic level.

>>> type(r'\n')
<class 'str'>
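Since the prefix cannot be recovered, one possible workaround is to have users mark regexes explicitly instead. The following is only a sketch under assumptions not stated in the question: the configuration is a Python file, so regex entries can be written as `re.compile(...)` objects, and plain strings are meant as literal words. `build_patterns` is a hypothetical helper name, and `re.Pattern` requires a reasonably recent Python 3.

```python
import re

# The r prefix changes how the literal is parsed, not the resulting type:
# both literals below produce the same two-character str.
assert r"\n" == "\\n"


def build_patterns(wordlist):
    """Compile a mixed list of plain words and pre-compiled regexes.

    Hypothetical helper: plain str entries are treated as literal words and
    wrapped in word boundaries; re.Pattern entries are used untouched.
    """
    patterns = []
    for entry in wordlist:
        if isinstance(entry, re.Pattern):
            # Explicitly marked as a regex by the user: keep as-is.
            patterns.append(entry)
        else:
            # Plain word: escape it and add word boundaries.
            patterns.append(re.compile(r"\b({})\b".format(re.escape(entry))))
    return patterns


# Assumed config format: regexes are wrapped in re.compile, words are plain.
wordlist = ["is", re.compile(r"\b(and)\b")]
patterns = build_patterns(wordlist)
```

With this, the pattern built from "is" matches the standalone word in "This is fine" but finds nothing in "This". The trade-off is that the type of the entry, rather than the spelling of the literal, carries the user's intent.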
Dimitris Fasarakis Hilliard
I had hoped that Python would expose some syntax-level information via reflection. As I understand it, it already does to some extent, since there is `__name__`, etc. My next attempt would be to check for escape sequences and infer whether a regex or a "simple word" was meant, but this feels unreliable. – duesee Feb 20 '17 at 12:01