I'm using Python 2.6.9 for some regex work. In the test string below, I'd like to match 111,111,111 and 222,222, but not the dollar amounts.

This is my current best attempt, which still picks up fragments of the dollar amounts:

import re

regexObj = re.compile(r'(?<!\$)\d{3}(?:,\d{3})*')
testStr1 = '111,111,111 and 222,222 but not $333,333,333 or $444,444'
regexObj.findall(testStr1)
# returns ['111,111,111', '222,222', '333,333', '444']

Can someone help out?

Thanks!

besslfcn

1 Answer

Btam, please note that if you only want to match millions and billions as stated, there is a problem with your * quantifier. The pattern below matches what you want, though it can be tuned further if you are able to specify boundaries.

\b(?<![$,])\d{3}(?:,\d{3}){1,2}(?!,)

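Note that the final quantifier is {1,2} instead of your original * because you said you want to match millions and billions: it allows one or two comma-separated groups after the first three digits, as in 222,222 and 111,111,111. With a *, you could also match a bare three-digit group (zero repetitions) or arbitrarily long numbers (trillions and zillions).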
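
For reference, here is a minimal sketch that runs the pattern above against the test string from the question. The variable names (`pattern`, `test_str`, `too_long`) are illustrative, and the four-group sample is an assumed input added only to show what the trailing `(?!,)` does.

```python
import re

# Pattern from the answer: exactly three digits, then one or two ",ddd" groups,
# not preceded by "$" or "," and not followed by another ",".
pattern = re.compile(r'\b(?<![$,])\d{3}(?:,\d{3}){1,2}(?!,)')

test_str = '111,111,111 and 222,222 but not $333,333,333 or $444,444'
matches = pattern.findall(test_str)
print(matches)  # ['111,111,111', '222,222']

# Hypothetical input with four groups: the (?!,) lookahead rejects it
# instead of returning the partial match '111,111,111'.
too_long = '111,111,111,111'
print(pattern.findall(too_long))  # []
```

Note that the pattern requires exactly three leading digits, so a number such as 1,234,567 would not match as written.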

If you have more information about the boundaries (for instance, you are matching a whole string, or you always expect a space after the number), we can make the matching more precise, either by anchoring or by adding boundaries.
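
As a rough illustration of those two options, the sketch below shows one anchored variant and one whitespace-delimited variant. Both patterns and the names `whole` and `spaced` are assumptions made for the example, not something the data in the question is known to satisfy.

```python
import re

# Assumption 1: the whole string is a single number, so anchor both ends.
whole = re.compile(r'^\d{3}(?:,\d{3}){1,2}$')

# Assumption 2: numbers are delimited by whitespace or the string edges.
# (?<!\S) and (?!\S) mean "no non-space character on this side".
spaced = re.compile(r'(?<!\S)\d{3}(?:,\d{3}){1,2}(?!\S)')

print(bool(whole.match('111,111,111')))  # True
print(bool(whole.match('$444,444')))     # False

test_str = '111,111,111 and 222,222 but not $333,333,333 or $444,444'
spaced_matches = spaced.findall(test_str)
print(spaced_matches)  # ['111,111,111', '222,222']
```

The whitespace variant excludes the dollar amounts without naming `$` at all, since any non-space character directly before the digits blocks the match.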

HamZa
zx81
  • @Btam, just added a fix: a negative lookahead after the {1,2} quantifier to make sure that we don't match a partial million that is part of a billion. You only want to match millions and billions, not thousands and zillions, right? – zx81 Apr 21 '14 at 19:47
  • Will probably mix and match, depending on the data. The regex tips and tricks are immensely useful by themselves, thanks so much. – besslfcn Apr 21 '14 at 19:52
  • @BTam You are welcome. Hey, I notice that you haven't yet voted on StackOverflow. If this answer or another answer solves your problem, please consider "accepting it" by clicking the checkmark and arrow on the left, as this is how the reputation system works. Of course there is no obligation to do so. Later when you have more reputation you can also upvote questions. Thanks for listening to my 20-second SO tutorial. :) – zx81 Apr 21 '14 at 19:55
  • @Btam Yes regex is awesome. If you want to learn some more about regex on Stack, have you come across the [FAQ](http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean/22944075#22944075) yet? – zx81 Apr 21 '14 at 20:00
  • @Btam Enjoy. Nice speaking to you. :) – zx81 Apr 21 '14 at 20:09