
I have code that goes through a file to find dates, but it's not reporting any match for my regular expression.

CODE:

std::string s(line);
std::smatch m;
std::regex e("^[0,1]?\d{1}\/(([0-2]?\d{1})|([3][0,1]{1}))\/(([1]{1}[9]{1}[9]{1}\d{1})|([2-9]{1}\d{3}))$");   
std::cout << "Target sequence: " << s << std::endl;
std::cout << "Regular expression: ^[0,1]?\d{1}\/(([0-2]?\d{1})|([3][0,1]{1}))\/(([1]{1}[9]{1}[9]{1}\d{1})|([2-9]{1}\d{3}))$" << std::endl;
std::cout << "The following matches and submatches were found:" << std::endl;

while (std::regex_search(s, m, e)) {
    for (auto x : m) std::cout << x << " ";
    std::cout << std::endl;
    s = m.suffix().str();
}

OUTPUT:

Success
Target sequence: 12/28/2002     2   15   38   43   50
Regular expression: ^[0,1]?d{1}/(([0-2]?d{1})|([3][0,1]{1}))/(([1]{1}[9]{1}[9]{1
}d{1})|([2-9]{1}d{3}))$
The following matches and submatches were found:
Enter q to quit:

Is my regular expression not correct or is it something else?

nhahtdh
Dave Cribbs
  • This doesn't fix stuff, but just a simplification of your regex: `^[0,1]?\d\/(([0-2]?\d)|(3[0,1]))\/((199\d)|([2-9]\d{3}))$` – nhahtdh Oct 17 '14 at 02:13

1 Answer


The cause is in your regex and in how you specify the string literal:

  • Before we make any fix to your regex, try printing out the string literal to the console:

    std::cout << "^[0,1]?\d{1}\/(([0-2]?\d{1})|([3][0,1]{1}))\/(([1]{1}[9]{1}[9]{1}\d{1})|([2-9]{1}\d{3}))$";
    

    You will see that the \ characters are missing from the output. The compiler here drops the backslash from unrecognized escape sequences such as \d, so <regex> never receives them.

    To put a literal \ in a string, you need to escape it as \\.

    By the way, printing the string is a useful debugging step in languages that have no dedicated RegExp literal, where the pattern must be built from a string.

  • You are anchoring your search with ^ and $. It will only find a match if the date is on its own in a line, and there must not even be leading or trailing spaces.

  • You have a lot of redundant syntax, e.g. [1]{1}. A character class containing a single character (one that is not special in regex) can be replaced by that character, i.e. 1{1}. And {1} is always redundant, so [1]{1} can be shortened to 1.

  • / doesn't need escaping, either in the string literal or regex.

  • Fixing the syntactic problems above and removing ^ and $:

    "[0,1]?\\d/(([0-2]?\\d)|(3[0,1]))/((199\\d)|([2-9]\\d{3}))"
    
  • By [0,1], you probably mean [01]. When you want to match either character A or B, just place them next to each other in a character class: [AB]. Your [0,1] will also match a comma.

  • You can drop the () in ([0-2]?\\d) and (3[0,1]), and likewise in the year portion. The outer capturing group is enough.

  • Applying the 2 points above:

    "[01]?\\d/([0-2]?\\d|3[01])/(199\\d|[2-9]\\d{3})"
    

The regex should now work when you want to extract dates, but it is not strict enough for validation: it still accepts, for example, month 19 or day 00. I don't know why you restrict the year to 1990 through 9999, but that is probably your business logic.
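
To tie the points above together, here is a minimal sketch of the corrected search. The helper name `find_dates` is hypothetical (it is not in the original code), and the loop uses `std::sregex_iterator` instead of reassigning `s` from `m.suffix()`; both approaches should visit the same matches.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every date-like substring found in `line`.
std::vector<std::string> find_dates(const std::string& line) {
    // Backslashes are doubled so the regex engine receives \d rather than d.
    // No ^ or $ anchors, so the date may appear anywhere in the line.
    static const std::regex date_re(
        "[01]?\\d/([0-2]?\\d|3[01])/(199\\d|[2-9]\\d{3})");

    std::vector<std::string> found;
    for (std::sregex_iterator it(line.begin(), line.end(), date_re), end;
         it != end; ++it) {
        // it->str() is the whole match; (*it)[1] is the day, (*it)[2] the year.
        found.push_back(it->str());
    }
    return found;
}
```

With the line from the question, `find_dates("12/28/2002     2   15   38   43   50")` yields a single element, `12/28/2002`.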

nhahtdh
  • Thank you so much for your thorough explanation. I do not know how to write regular expressions; this is my first time using one and I literally know nothing about them. I just copied this expression from Google. Do you know of any resources where I can learn about them? Also, I do not want it to be restricted to 1990 to 9999; how would I fix that? Again, thank you. – Dave Cribbs Oct 17 '14 at 03:01
  • @DaveCribbs: What range do you want for the year? As for resources, in case of C++, check the ECMA/JavaScript RegExp (since regex in C++ is based on it) **first**, then check http://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean/22944075#22944075 if you need deeper explanation. – nhahtdh Oct 17 '14 at 03:04
  • Is there a way to make it accept any year or does it have to have a specification? – Dave Cribbs Oct 17 '14 at 03:09
  • @DaveCribbs: Depends on your application. You can make it `\d+` if you don't really care about the number of digits of the year. – nhahtdh Oct 17 '14 at 03:20
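
As an illustration of the `\d+` suggestion in the last comment, the following sketch accepts a year of any length. The helper name `find_dates_any_year` is hypothetical, and the raw string literal (`R"(...)"`, available since C++11) is a swapped-in alternative to doubling every backslash; it is not what the original answer used.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: like the date search above, but the year is any run of digits.
std::vector<std::string> find_dates_any_year(const std::string& line) {
    // In a raw string literal, \d reaches the regex engine unchanged,
    // so no double escaping is needed.
    static const std::regex date_re(R"([01]?\d/([0-2]?\d|3[01])/\d+)");

    std::vector<std::string> found;
    for (std::sregex_iterator it(line.begin(), line.end(), date_re), end;
         it != end; ++it) {
        found.push_back(it->str());  // whole matched date
    }
    return found;
}
```

Note that `\d+` will happily match years such as `87` or `12345`; whether that is acceptable depends on the application, as the comment says.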