I want to get all the occurrences of the pattern '[number]' including their context but I can't.

Here is my code:

import re
text = 'some crap [00][0] some more'
regex = r'\[[0-9]*\]'
regex = '.{0,10}' + regex + '.{0,10}'
occurrences = re.findall(regex, text)
for occ in occurrences:
    print(occ)

What is actually wrong!?

My code works just as I want except when two [number] blocks have fewer than 10 characters between them. In that case it gives me one result where I'm looking for two. If I make the regex return overlapping occurrences, it gives me a result for every possible context length. I can't fix the context length at exactly 10 because I also want to include occurrences near the beginning and end of the string.

What I actually want:

I prefer a pure regex solution to get me all the occurrences of the mentioned pattern including their context.

If that is really impossible, I'd be fine with a solution that uses the match positions and slices a range from the string.

tgwtdt
  • Can you please provide us with multiple inputs and their expected outputs so that we have a better understanding of what you are looking for? – ctwheels Nov 28 '17 at 16:42
  • [`(?:(?!\[\d*\]).){0,10}\[\d*\](?:(?!\[\d*\]).){0,10}`](https://regex101.com/r/axf8ub/1)? – ctwheels Nov 28 '17 at 16:46
  • @user if you mean '.{0,10}\[(\d+)\].{0,10}', no I'm not, because that does not give me the context; it only gives me some numbers. – tgwtdt Nov 28 '17 at 16:49
  • @ctwheels that worked. Thanks a lot :) You are a regex god! – tgwtdt Nov 28 '17 at 16:53

1 Answer

Read about non-capturing groups and negative lookahead.
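
As a rough illustration of how these two constructs combine, the sketch below builds a "guarded" single-character pattern: `(?!...)` refuses to proceed if a [number] block starts at the current position, and `(?:...)` groups that check with the `.` without creating a capture. The names `block`, `guarded_char`, `sample`, `plain` and `guarded` are only illustrative and do not come from the question.

```python
import re

block = r'\[[0-9]*\]'

# One character, but only if a [number] block does NOT start at it.
# (?!...) is the negative lookahead, (?:...) is the non-capturing group.
guarded_char = '(?:(?!' + block + ').)'

sample = 'ab[0] tail'

# A plain '.' runs straight through the [0] block.
plain = re.match('.{0,10}', sample).group()

# The guarded version stops right before the block begins.
guarded = re.match(guarded_char + '{0,10}', sample).group()

print(repr(plain))    # 'ab[0] tail'
print(repr(guarded))  # 'ab'
```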

To fix your issue, just change the fourth line to:

regex = '(?:(?!' + regex + ').){0,10}' + regex + '(?:(?!' + regex + ').){0,10}'
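
For reference, here is a self-contained sketch of the question's script with that change applied. It keeps the bare block pattern in a separate variable (`block`, an illustrative name) instead of reassigning `regex`. It also shows the position-based fallback the question mentions, using `re.finditer` and slicing; `windows` is likewise a hypothetical name.

```python
import re

text = 'some crap [00][0] some more'
block = r'\[[0-9]*\]'

# Up to 10 context characters, none of which may start a [number] block.
context = '(?:(?!' + block + ').){0,10}'
regex = context + block + context

occurrences = re.findall(regex, text)
for occ in occurrences:
    print(occ)
# some crap [00]
# [0] some more

# Position-based fallback: slice a fixed window around each match.
# Unlike the regex above, these windows may contain neighbouring blocks.
windows = [
    text[max(0, m.start() - 10):m.end() + 10]
    for m in re.finditer(block, text)
]
for win in windows:
    print(repr(win))
# 'some crap [00][0] some m'
# ' crap [00][0] some more'
```

The two approaches answer slightly different questions: the pure regex cuts the context off at the adjacent block, while the slicing version always returns up to 10 characters on each side, whatever they contain.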
yukashima huksay