The code is presented below:

import re

line = "dogs are better than humans"
# re.match() tries to match the pattern at the start of the string
matchObj = re.match(r'(.*) are (.*?) .*', line)
if matchObj:
    print("matchObj.group() : ", matchObj.group())
Michael Shopsin
Rohit Sah
  • Perhaps you can tell us what you think your regex does? There are a lot of tutorials on regex, but we can help clear up your misconceptions about what this expression does. Programming is a learning process and regex is complicated to understand. – Michael Shopsin Jul 30 '18 at 14:57
  • Maybe this shall help you answer me precisely. – Rohit Sah Jul 31 '18 at 06:15

1 Answer

  • (.*): matches and captures any character (except newlines) any number of times, including zero times. The dot (.) denotes "any character" and the asterisk (*) signifies repetition. The parentheses denote a capture group (explained below).

  • " are ": matches the literal string " are ", including the space on each side.

  • (.*?): same as (.*), except that it matches as few characters as possible (non-greedy, or lazy). Without the ?, a greedy (.*) would run up to the last space that still lets the rest of the pattern match, so it would capture "better than". Adding the non-greedy symbol (?) makes it stop at the first space (since a space is the character that follows this segment of the expression), so it captures only "better".

  • .*: matches any character any number of times; here it consumes the rest of the line ("than humans") without capturing it.
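
To see the difference between the lazy and the greedy form, here is a minimal sketch that runs the pattern from the question next to a variant whose second group is greedy. The greedy variant and the names lazy and greedy are added here only for comparison; they are not part of the original code.

```python
import re

line = "dogs are better than humans"

# Pattern from the question: the second group is lazy and stops at the first space.
lazy = re.match(r'(.*) are (.*?) .*', line)

# Comparison variant (an assumption for illustration): the second group is greedy
# and extends to the last space that still lets the rest of the pattern match.
greedy = re.match(r'(.*) are (.*) .*', line)

print(lazy.group(2))    # better
print(greedy.group(2))  # better than
```

In both cases the first group is "dogs", because the literal " are " that follows it occurs only once in the string.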

Capture groups, or captures for short, are portions of the entire match. Wrapping part of your regex in parentheses allows you to easily retrieve the portion of the match that it matched.

(dogs) are (better) than humans
(.*)   are (.*?)    .*

In your example, "dogs" and "better" would be captured. These captures are also referred to as "groups", and in regular expressions they are marked by a pair of parentheses.
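
As a rough sketch of how the captured groups can be read back in Python, the match object returned by re.match() exposes them through group() and groups(). The code below reuses the line and pattern from the question; only the extra print calls are new.

```python
import re

line = "dogs are better than humans"
matchObj = re.match(r'(.*) are (.*?) .*', line)

if matchObj:
    print(matchObj.group())   # entire match: dogs are better than humans
    print(matchObj.group(1))  # first capture group: dogs
    print(matchObj.group(2))  # second capture group: better
    print(matchObj.groups())  # all capture groups: ('dogs', 'better')
```

group() with no argument (or group(0)) returns the whole match, while group(1) and group(2) return the individual captures, numbered by the order of their opening parentheses.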

Play around with the regex here. Hover on the match to see which portions of the expression are captured.

shreyasminocha