Not understanding regex (1[0-2]|0?[1-9]) for validating time strings

Question

I found and tested a regex to validate a time string such as 11:30 AM:

^(1[0-2]|0?[1-9]):([0-5][0-9])(\s[A|P]M)\)?$

I understand most of it except the beginning:

(1[0-2]|0?[1-9])

Can someone explain what is going on? Does 1[0-2] mean there is a first digit that can be between 0 and 2? And I don't understand |0?[1-9].

@Xuflx, what is your rationale for suggesting this may be a dup of the question you referenced? That question seems to just deal with regular expressions in a general way. — Cary Swoveland, Oct 26 '15 at 03:11
@CarySwoveland: That question is the target for closing all these "explain me this regex" questions as dups. All these questions are useless - just take a look at the reference question, or use an online tester. — nhahtdh, Oct 26 '15 at 03:30
Readers: this question was previously closed (by Xufix and nhahtdh) as a dup (of the same question now referenced by Jerry) and then reopened. If you do not regard it as a dup, please vote to re-open. (I cannot--the SO software says I've already voted to reopen, which I find a bit odd.) — Cary Swoveland, Oct 26 '15 at 11:19
@nhahtdh from that "Reference" question: "regex is suffering from give me ze code type of questions and poor answers with no explanation. This reference is meant to provide links to quality Q&A." Neither "give me ze code" nor "poor answers" are the case here. It's a normal question with good answers, so why the need to close? — Mischa, Oct 26 '15 at 12:00
(context: this popped up in my review queue, and I abstained) Yes, this question is well formed, and a definitive answer exists. However, I believe such questions are unlikely to be helpful in general: the question is hard to search for (who would search for `"1[0-2]"`?) and the right answer would be very specific (something along the lines of the output of https://regex101.com/, as suggested in one of the comments in the "reference" question). — RandomSeed, Oct 26 '15 at 12:30
@RandomSeed, it's not hard to search for. You'll find it by searching for "regex to validate a time", which I can imagine to be a quite common query. The reference question however will not be in the search results for that query. — Mischa, Oct 26 '15 at 13:26
Aaron, I suggest you remove "(1[0-2]|0?[1-9])" from the title. It adds nothing, is a distraction and may suggest to some that your question probably does not have merit. — Cary Swoveland, Oct 27 '15 at 05:22

score 3 · Accepted Answer · answered Oct 26 '15 at 02:27

(1[0-2]|0?[1-9])

| separates the regex into two parts, where

1[0-2]

matches 10, 11 or 12, and

0?[1-9]

matches 1 to 9, with an optional leading 0.
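
To see the two alternatives in action, here is a minimal Ruby sketch. It assumes the hour pattern is tested on its own, so it is wrapped in \A and \z (these anchors are an addition, not part of the original regex), and hour_matches? is a hypothetical helper name used only for illustration.

```ruby
# The hour alternation from the question, anchored with \A and \z
# (an assumption made here) so that the whole string must match.
HOUR = /\A(1[0-2]|0?[1-9])\z/

# Hypothetical helper: true if the string is a valid 12-hour clock hour.
def hour_matches?(s)
  HOUR.match?(s)
end

# "1", "09", "10" and "12" are accepted: "1" and "09" via 0?[1-9],
# "10" and "12" via 1[0-2].
# "0", "00", "13" and "010" are rejected: neither alternative
# can consume the whole string.
%w[1 09 10 12 0 00 13 010].each do |s|
  puts "#{s.inspect}: #{hour_matches?(s)}"
end
```

Note that the order of the alternatives matters less than the anchoring: without \z, "13" would still produce a match on its leading "1".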

answered Oct 26 '15 at 02:27

Yu Hao


Cary Swoveland · Answer 2 · 2015-10-27T06:00:27.477

I will explain by writing the regex in extended mode, which permits comments:

r = /
    ^     # match the beginning of the string
    (     # begin capture group 1
    1     # match 1
    [0-2] # match one of the characters 0,1,2
    |     # or
    0?    # optionally match a zero
    [1-9] # match one of the characters between 1 and 9
    )     # end capture group 1
    :     # match a colon
    (     # begin capture group 2
    [0-5] # match one of the characters between 0 and 5
    [0-9] # match one of the characters between 0 and 9
    )     # end capture group 2
    (     # begin capture group 3
    \s    # match one whitespace character
    [A|P] # match one of the characters A, | or P
    M     # match M
    )     # end capture group 3
    \)?   # optionally match a right parenthesis
    $     # match the end of the string
    /x    # extended mode

As noticed by @Mischa, [A|P] is incorrect. It should be [AP]. That's because "|" is just an ordinary character when it's within a character class.
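
The difference is easy to check. The following sketch compares the regex as posted in the question with a copy in which only the character class is changed; the constant names ORIGINAL and CORRECTED are mine, chosen for illustration.

```ruby
# The regex exactly as posted in the question.
ORIGINAL  = /^(1[0-2]|0?[1-9]):([0-5][0-9])(\s[A|P]M)\)?$/
# Same regex with the character class changed from [A|P] to [AP].
CORRECTED = /^(1[0-2]|0?[1-9]):([0-5][0-9])(\s[AP]M)\)?$/

# Inside a character class "|" is a literal, so the original
# accepts "|M" as if it were a meridiem indicator.
puts ORIGINAL.match?("10:30 |M")   # true
puts CORRECTED.match?("10:30 |M")  # false

# Both accept a genuine time string.
puts ORIGINAL.match?("10:30 PM")   # true
puts CORRECTED.match?("10:30 PM")  # true
```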

Also, I think the regex would be improved by moving \s out of capture group 3. We therefore might write:

r = /^(1[0-2]|0?[1-9]):([0-5][0-9])\s([AP]M)\)?$/

It could be used thusly:

result = "11:39 PM" =~ r
if result
  puts "It's #{$2} minutes past #{$1}, #{ $3=='AM' ? 'ante' : 'post' } meridiem."
else
  # raise exception?
end
  #=> It's 39 minutes past 11, post meridiem.

In words, the revised regex reads as follows:

match the beginning of the string.
match "10", "11", "12", or one of the digits "1" to "9", optionally preceded by a zero, and save the match to capture group 1.
match a colon.
match a digit between "0" and "5", then a digit between "0" and "9", and save the two digits to capture group 2.
match a whitespace character.
match "A", or "P", followed by "M", and save the two characters to capture group 3.
optionally match a right parenthesis.
match the end of the string.
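
One Ruby-specific caveat about the first and last steps: in Ruby, ^ and $ anchor to the beginning and end of a line, not of the whole string, so the description above holds only for single-line input. If the input might contain newlines, \A and \z are the string anchors. The sketch below illustrates the difference; the constant names and the sample input are assumptions made for the example.

```ruby
# The revised regex with line anchors, as written above.
LINE_ANCHORED   = /^(1[0-2]|0?[1-9]):([0-5][0-9])\s([AP]M)\)?$/
# The same pattern with string anchors (\A and \z) swapped in.
STRING_ANCHORED = /\A(1[0-2]|0?[1-9]):([0-5][0-9])\s([AP]M)\)?\z/

# Hypothetical input: a valid time followed by a second line.
multiline = "11:39 PM\nnot a time"

puts LINE_ANCHORED.match?(multiline)     # true: the first line matches
puts STRING_ANCHORED.match?(multiline)   # false: the whole string must match
puts STRING_ANCHORED.match?("11:39 PM")  # true
```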

`[A|P]` simply means `A` or `P`, right? Not "match one of the characters A, | or P" — Mischa, Oct 26 '15 at 02:37
@Mischa, in a character class, no. It is just the character `|`. Only outside a character class does it mean "or". — Cary Swoveland, Oct 26 '15 at 02:39
So this regex is wrong, because I assume they want `A` or `P`. How would you write that correctly? Simply `[AP]`? — Mischa, Oct 26 '15 at 02:59
@Mischa, if you are correct, yes, that would be just `[AP]`, but I don't see how you can draw that conclusion. — Cary Swoveland, Oct 26 '15 at 03:01
Are you kidding? If they are trying to match a time, why would they want to match something like `10:30 |M`? — Mischa, Oct 26 '15 at 03:12
@Mischa, ha! I didn't look at the context until after I had completed my answer. You are correct, of course! I finally clued in and left a comment on the question just before reading your comment. — Cary Swoveland, Oct 26 '15 at 03:19
