
I'm new to regex and Python in general. I ran into some problems while practicing with regex, and I don't quite understand how Python searches for patterns. I wrote some code that, in my mind, should work fine, but it returns None.

So, the string must start with a word character "\w" and then it continues. The string must end with a character that is not a digit, so when Python starts searching, it should find "j" first, then "o", then "e"; then it encounters the space and must stop, since the space is a non-digit character "\D$". But the code returns None. What am I doing wrong?

import re

string = "joe 10 15 20 30 40"

strRegex = re.compile(r"\w(.*)\D$")
mo = strRegex.search(string)
print(mo)

The second thing I'm struggling with is grouping. In the code below I'm trying to find all matches that start with a word character and end with a comma. So basically, the output should be the first and second sentences separately, but it's not working either.

string = "The generated Lorem Ipsum is therefore always free from repetition, injected humour,"

stringRegex = re.compile(r"(^\w.*,$)")
mo = stringRegex.findall(string)
print(mo)

Any help would be much appreciated.

Wiktor Stribiżew
  • `$` matches the end of the line, string, or content, so it doesn't work here because your regex has not reached the end – azro May 06 '20 at 21:09
  • So, basically it only looks at the end of the entire string? What I had in mind was that Python would start the search from the beginning and stop when it first encounters `\D$`. Now it's clear, thanks. – Toma Margishvili May 06 '20 at 21:13
  • You should just go through the basics. `\w+` matches 1 or more letters, digits or `_`. `^` matches the start of string, see [demo](https://regex101.com/r/PuciI8/1) getting you `Joe` from the first string. `$` matches end of string so you can't expect two matches in the second case. – Wiktor Stribiżew May 06 '20 at 21:15
  • It starts at the beginning, yes, but your regex requires that the match end with a non-digit followed by the end of the string; as the last char is a digit, it can't match – azro May 06 '20 at 21:15
  • For your second concern try: `stringRegex = re.compile(r"(\w.*),")`. The capture group starts with a word character, takes all the following characters, and ends before a comma. The primary change is leaving out `^` and `$` and not having the comma as part of the capture group. – DarrylG May 06 '20 at 21:26
  • Thanks everyone for helping. – Toma Margishvili May 06 '20 at 22:12
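
To make the points from the comments concrete, here is a minimal sketch that runs both of the original patterns next to one possible alternative for each. The variable names (`first_original`, `first_fixed`, `whole`, `parts`) are made up for illustration, and the `\w[^,]*,` pattern is an assumption about what output was wanted, not something proposed in the comments.

```python
import re

string = "joe 10 15 20 30 40"

# Original pattern: \D$ requires a non-digit as the very last character.
# The string ends in "0", so no match is possible anywhere.
first_original = re.search(r"\w(.*)\D$", string)
print(first_original)        # None

# Alternative: anchor at the start and take one or more word characters.
# \w+ stops at the space because a space is not a word character.
first_fixed = re.search(r"^\w+", string)
print(first_fixed.group())   # joe

text = "The generated Lorem Ipsum is therefore always free from repetition, injected humour,"

# Original pattern: ^ and $ pin the match to the whole string,
# so findall can return at most one item.
whole = re.findall(r"(^\w.*,$)", text)
print(len(whole))            # 1

# Alternative (illustrative): no anchors, and [^,]* instead of .* so each
# match stops at the first comma rather than running on to the last one.
parts = re.findall(r"\w[^,]*,", text)
print(parts)
```

With the anchors removed, a greedy `.*` would still run to the last comma and give a single match, which is why the sketch uses `[^,]*`; a lazy `.*?` would be another way to get the same split.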
