4

I'm trying to match dollar amounts between 10,000 and 150,000,000.

I got this pattern from a Stack user previously, but it only catches amounts from 1,000,000 through 150,000,000:

(?<!\d)(\d{1,3}(?:,\d{3}){2,})(?!\d)

I've tried reworking it for the last hour but can't get it to work, and regex is a notorious head wreck :D Can anyone update it to start catching from 10,000? Thanks!

Digital Moniker

4 Answers

3

One approach might be to use the generic regex for thousands, and then add a lookahead to restrict the lengths to the range you want:

^(?=.{6,11}$)\d{1,3}(?:,\d{3})*$
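
As a quick sanity check, the pattern above can be tried in Python. This is only a sketch, and the sample strings are made up for illustration. Note that the lookahead limits the length of the string, not its value, so an amount such as 999,999,999 still passes.

```python
import re

# Length-restricted thousands pattern from the answer above
pattern = re.compile(r"^(?=.{6,11}$)\d{1,3}(?:,\d{3})*$")

# Hypothetical sample inputs (not from the original question)
samples = [
    "5,000", "10,000", "999,999", "1,000,000",
    "150,000,000", "999,999,999", "1,000,000,000", "150000",
]

# Map each sample to whether the whole string matches
results = {s: bool(pattern.match(s)) for s in samples}

for s, ok in results.items():
    print(f"{s:>15}  {ok}")
```

If the upper bound of 150,000,000 has to be exact, the matched string would still need a numeric comparison after stripping the commas.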


Tim Biegeleisen
1

You can use

(?<!\d)(\d{1,3}(?:,\d{3})+)(?!\d)

See the regex demo.

Details:

  • (?<!\d) - no digit allowed immediately to the left of the current location
  • (\d{1,3}(?:,\d{3})+) - Group 1: one to three digits followed by one or more (due to the + quantifier) occurrences of a comma and three digits
  • (?!\d) - no digit allowed immediately to the right of the current location.
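
To see what this pattern pulls out of running text, here is a minimal sketch using Python's re.findall. The sentence is invented for illustration. Because + allows a single comma group, amounts from 1,000 upward are matched, so the sketch adds a numeric filter (an assumption, not part of the answer) to keep only values in the requested range.

```python
import re

# Pattern from the answer above: one to three digits, then one or more ",ddd" groups
pattern = re.compile(r"(?<!\d)(\d{1,3}(?:,\d{3})+)(?!\d)")

# Hypothetical text (not from the original question)
text = "Fees were 950, then 1,500, then 10,000 and finally 2,500,000 dollars."

# findall returns the contents of Group 1 for every match
matches = pattern.findall(text)

# Extra step (assumption): keep only amounts between 10,000 and 150,000,000
in_range = [m for m in matches if 10_000 <= int(m.replace(",", "")) <= 150_000_000]

print(matches)
print(in_range)
```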
Wiktor Stribiżew
0

Just cast it and compare it programmatically:

import pandas as pd

dct = {"numbers": ["10", "100", "200", "5,000", "10,000", "15000", "some weird stuff", "160,000,000"]}


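def tester(number):
    """Return True if the string parses to a number in [10,000, 150,000,000]."""
    try:
        # Strip thousands separators before converting
        value = float(number.replace(",", ""))
    except (AttributeError, ValueError):
        # Not a string, or not numeric
        return False
    return 10_000 <= value <= 150_000_000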

df = pd.DataFrame(dct)
df["in_range"] = df["numbers"].apply(tester)
print(df)

This yields

            numbers  in_range
0                10     False
1               100     False
2               200     False
3             5,000     False
4            10,000      True
5             15000      True
6  some weird stuff     False
7       160,000,000     False
Jan
0

Your current pattern could also match 999,999,999,999, because this part is repeated 2 or more times: (?:,\d{3}){2,}

The pattern also uses only \d, which matches any digit 0-9, so nothing in it caps the value at 150,000,000.


Matching 3 digits after each comma, you could use an alternation | to match the separate range parts:

(?<!\S)(?:[1-9]\d\d?,\d{3}|(?:[1-9]\d?|1[0-4]\d),\d{3},\d{3}|150,000,000)(?!\S)
  • (?<!\S) Assert whitespace boundary to the left
  • (?: Non capture group
    • [1-9]\d\d?,\d{3} Match range 10,000 - 999,999
    • | Or
    • (?: Non capture group
      • [1-9]\d? Match range 1 - 99
      • | Or
      • 1[0-4]\d Match digits 100 - 149
    • ) Close non capture group
    • ,\d{3},\d{3} Match range ,000,000 - ,999,999
    • | Or
    • 150,000,000 Match the max value
  • ) Close non capture group
  • (?!\S) Assert whitespace boundary to the right
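
The breakdown above can be checked with a short Python sketch. The boundary values and the sample sentence are chosen here for illustration and are not from the original question.

```python
import re

# Alternation pattern from the answer above
pattern = re.compile(
    r"(?<!\S)(?:[1-9]\d\d?,\d{3}|(?:[1-9]\d?|1[0-4]\d),\d{3},\d{3}|150,000,000)(?!\S)"
)

# Hypothetical boundary values around both ends of the range
samples = [
    "9,999", "10,000", "999,999", "1,000,000",
    "149,999,999", "150,000,000", "150,000,001", "999,999,999,999",
]
results = {s: pattern.fullmatch(s) is not None for s in samples}

# Hypothetical running text; matches must be delimited by whitespace
text = "bids of 9,999 and 75,000 and 12,500,000 and 200,000,000 arrived"
found = pattern.findall(text)

for s, ok in results.items():
    print(f"{s:>17}  {ok}")
print(found)
```

Since the whitespace boundaries reject an amount that is directly followed by punctuation, such as "10,000.", they might need loosening for free-form text.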


The fourth bird