octet = /\d{,2}|1\d{2}|2[0-4]\d|25[0-5]/
ip_regex = /^#{octet}\.#{octet}\.#{octet}\.#{octet}/

The regex above is used to match an IP address. I understand that \d is used to match a digit, and I also understand the ip_regex part, but after looking at some tutorials I'm still not able to completely understand the octet part. Could someone enlighten me? What does {,2}|1 mean for example?

Jonathan S
  • https://regex101.com – halfelf Apr 13 '18 at 03:28
  • One could write `ip_regex = /^(?:#{octet}\.){3}#{octet}/`. Note `^` is the start-of-line anchor. If the start-of-string anchor is wanted use `\A` instead. – Cary Swoveland Apr 13 '18 at 05:31
  • Easier would be `arr = str.split('.'); arr.size == 4 && arr.all? { |s| s =~ /\A\d+\z/ && s.to_i <= 255 }`. Better yet is `require 'ipaddr'; IPAddr.new(str).ipv4?`. If `str` is not a valid string representation of an IP address, `IPAddr.new(str)` raises an exception (an `ArgumentError` or a subclass of it), which has to be handled. See [IPAddr](http://ruby-doc.org/stdlib-1.9.3/libdoc/ipaddr/rdoc/IPAddr.html). – Cary Swoveland Apr 13 '18 at 06:13
  • The actual octet bit should be `octet = '(?:\d{1,2}|1\d{2}|2[0-4]\d|25[0-5])'` and ip regex `ip_regex = /\A#{octet}(?:\.#{octet}){3}\z/` – Wiktor Stribiżew Apr 13 '18 at 09:15
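
A minimal sketch of the `IPAddr` approach suggested in the comments above, assuming a reasonably recent Ruby (2.0 or later) where `IPAddr::Error` exists. The method name `valid_ipv4?` is made up for illustration. Note that `IPAddr` also accepts forms such as `192.168.0.1/24`, so it is not a drop-in replacement for a strict dotted-quad regex.

```ruby
require 'ipaddr'

# Hypothetical helper: true only if +str+ parses as an IPv4 address.
# IPAddr.new raises for unparsable input, so the exception is rescued here.
def valid_ipv4?(str)
  IPAddr.new(str).ipv4?
rescue IPAddr::Error
  false
end

["192.168.0.1", "256.1.1.1", "1.2.3", "::1"].each do |sample|
  puts "#{sample.inspect}: #{valid_ipv4?(sample)}"
end
```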

3 Answers


What does {,2}|1 mean for example?

You should be looking at the parts separated by |: \d{,2} is a pattern, 1\d{2} is a pattern, etc. Here’s what they mean:

  • \d{,2} – up to 2 digit characters, i.e. numbers from 0 to 99
  • 1\d{2} – the digit 1 followed by 2 digits, i.e. numbers from 100 to 199
  • 2[0-4]\d – 2, then a digit from 0 to 4, then a digit, i.e. numbers from 200 to 249
  • 25[0-5] – 2, 5, and a digit from 0 to 5, i.e. numbers from 250 to 255

When you join them together with |, it’s the pattern matching any of those patterns, i.e. numbers from 0 to 255.
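
To see which alternative is responsible for which numbers, something like the following could be tried in irb. It is only a sketch: each alternative is anchored with `\A` and `\z` so that it has to match the whole string, and the names `alternatives` and `matching_alternative` are made up for illustration.

```ruby
# Each alternative from the question's octet pattern, anchored so that
# only whole-string matches count.
alternatives = {
  '\d{,2}'   => /\A\d{,2}\z/,
  '1\d{2}'   => /\A1\d{2}\z/,
  '2[0-4]\d' => /\A2[0-4]\d\z/,
  '25[0-5]'  => /\A25[0-5]\z/
}

# Returns the source text of the first alternative that matches +str+
# completely, or nil if none does.
matching_alternative = lambda do |str|
  pair = alternatives.find { |_source, regex| regex.match?(str) }
  pair && pair.first
end

%w[7 42 150 249 255 256].each do |sample|
  puts "#{sample.ljust(4)} -> #{matching_alternative.call(sample).inspect}"
end
```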

The \d{,2} pattern is a bit wrong because it also matches the empty string and allows a leading zero, which is inconsistent with the other patterns.
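
A quick way to see both problems, sketched with the question's own patterns. The variable names are mine, and `anchored_octet` is just the question's octet wrapped in `\A(?:...)\z` so that it has to match the whole string.

```ruby
# The patterns exactly as posted in the question.
question_octet    = /\d{,2}|1\d{2}|2[0-4]\d|25[0-5]/
question_ip_regex = /^#{question_octet}\.#{question_octet}\.#{question_octet}\.#{question_octet}/

# The same octet, forced to match the entire string.
anchored_octet = /\A(?:\d{,2}|1\d{2}|2[0-4]\d|25[0-5])\z/

puts anchored_octet.match?("")     # true: the empty string is accepted
puts anchored_octet.match?("07")   # true: leading zero allowed for two digits
puts anchored_octet.match?("007")  # false: but not for three digits

# Because every octet may be empty and there is no end anchor,
# the question's ip_regex accepts strings that are clearly not addresses.
puts question_ip_regex.match?("...")        # true
puts question_ip_regex.match?("1.2.3.999")  # true
```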

If you wanted to check whether an entire string matched the pattern, a correct version would probably be this:

octet = /\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]/
ip_regex = /\A#{octet}\.#{octet}\.#{octet}\.#{octet}\z/
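
As a rough check of the version above, here it is run against a handful of sample strings (the samples are my own picks, not an exhaustive test):

```ruby
octet = /\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]/
# Interpolating a Regexp wraps it in a group, so each alternation stays
# confined to its own octet.
ip_regex = /\A#{octet}\.#{octet}\.#{octet}\.#{octet}\z/

samples = [
  "192.168.0.1",      # ordinary address
  "255.255.255.255",  # upper bound of every octet
  "0.0.0.0",          # lower bound of every octet
  "256.1.1.1",        # octet out of range
  "01.2.3.4",         # leading zero
  "1.2.3",            # too few octets
  "...",              # empty octets
  "1.2.3.4\n"         # trailing newline, rejected because of \z
]

samples.each { |s| puts "#{s.inspect}: #{ip_regex.match?(s)}" }
```

The first three samples match and the rest do not.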
Ry-
  • @Ry Your answer is better than mine, as it mentions empty strings and leading zero. – sawa Apr 13 '18 at 03:32

One octet in an IP address (in dotted-octet notation) may not exceed 255.

So given /\d{,2}|1\d{2}|2[0-4]\d|25[0-5]/, break it apart like this: / \d{,2} | 1\d{2} | 2[0-4]\d | 25[0-5] /x

The first snip, \d{,2}, matches up to two digits, so it covers 0 through 99 (and, as written, also the empty string). The second snip, 1\d{2}, matches any number from 100 to 199. The third snip, 2[0-4]\d, matches any number from 200 to 249. The last snip, 25[0-5], matches any number from 250 to 255. Put them all together, and an octet may be any number between 0 and 255.
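
The /x flag used above also allows line breaks and comments inside the pattern. A small sketch comparing the compact and the spaced-out spelling; both are anchored here (my addition) so they can be compared on whole strings, and the names `compact` and `spaced` are illustrative.

```ruby
compact = /\A(?:\d{,2}|1\d{2}|2[0-4]\d|25[0-5])\z/

# Same pattern in extended mode: whitespace and comments are ignored.
spaced = /\A(?:
    \d{,2}     # zero, one or two digits
  | 1\d{2}     # 100 to 199
  | 2[0-4]\d   # 200 to 249
  | 25[0-5]    # 250 to 255
)\z/x

inputs = (0..300).map(&:to_s) + ["", "07", "007"]
same = inputs.all? { |s| compact.match?(s) == spaced.match?(s) }
puts "both spellings agree on all inputs: #{same}"
```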

Phlip

There is a really cool tool to help with understanding regular expressions: https://regexper.com It draws the expression as a railroad diagram, which is more visual and easier to understand than the regular expression itself.

For example, for octet you get:

[image: Regexper diagram of the octet pattern]

The {,2} part is still not very clear in that diagram, though. a{,2} means at most 2 repetitions, so it is equivalent to a{0,2} (between 0 and 2). Making this change in the regular expression gives a somewhat better Regexper diagram:

[image: Regexper diagram of the octet pattern with {0,2}]

And now I think it is easy to read.
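
If you want to convince yourself that the two spellings behave the same in Ruby, a small check along these lines should do. This is Ruby behaviour; other regex engines may treat {,2} differently.

```ruby
short_form = /\Aa{,2}\z/   # upper bound only
long_form  = /\Aa{0,2}\z/  # explicit lower bound of 0

["", "a", "aa", "aaa"].each do |s|
  puts "#{s.inspect}: {,2} => #{short_form.match?(s)}, {0,2} => #{long_form.match?(s)}"
end
```

Both forms accept "", "a" and "aa" and reject "aaa".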

Another good tool to try your regular expression is Rubular.