4

How can I modify the following line of code (which reads the names of parameters in config_file):

re.findall('Parameter.*', config_file)

so as to ignore lines containing a comment symbol (%) to the left? i.e. in the following example,

Parameter: A
%Parameter: B
  %  Parameter: C
 Parameter: D %the best parameter

only A and D match?

Demosthene
  • 329
  • 1
  • 2
  • 13

2 Answers2

3

Try this Regex:

(?:(?<=^)|(?<=\n))\s*Parameter.*

Click for Demo

Explanation:

  • (?:(?<=^)|(?<=\n)) - finds the position which is just preceded by a \n or start-of-the-string
  • \s* - matches 0+ occurrences of white-spaces
  • Parameter.* - matches Parameter followed by 0+ occurrences of any character(except newline characters)
Gurmanjot Singh
  • 8,936
  • 2
  • 17
  • 37
  • I was just about to leave a comment about the lack of explanation -- thank you for preempting me. Also, good catch that this pattern could match after a newline as well as the start of the string. – colopop Dec 21 '17 at 15:43
  • This will fail to match when input is `foo Patameter: M` but requirement is to ignore only `%` lines – anubhava Dec 21 '17 at 15:45
  • @anubhava Although that requirement was not mentioned by the OP, but I guess we can use [`(?:(?<=^)|(?<=\n))[ ]*(?!\s*?%).*?Parameter.*`](https://regex101.com/r/Ss7JFZ/2) for that case. – Gurmanjot Singh Dec 21 '17 at 15:57
  • Why so complicated, you can just use: `(?m)^[^%]*Parameter.*` – anubhava Dec 21 '17 at 16:28
  • @anubhava That certainly looks better. – Gurmanjot Singh Dec 21 '17 at 16:30
1

You can make use of regex alternation and capturing groups in findall:

>>> test_str = ("Parameter: A\n"
...     "%Parameter: B\n"
...     "  %  Parameter: C\n"
...     " Parameter: D %the best parameter")
>>>
>>> print filter(None, re.findall(r'%\s*Parameter|(Parameter.*)', test_str))
['Parameter: A', 'Parameter: D %the best parameter']

Matches that you want to discard should appear before the last capturing group match in an alternation.

RegEx Demo

anubhava
  • 664,788
  • 59
  • 469
  • 547