384

Does anyone have a regular expression handy that will match any legal DNS hostname or IP address?

It's easy to write one that works 95% of the time, but I'm hoping to get something that's well tested to exactly match the latest RFC specs for DNS hostnames.

DonGar
  • Be aware: It's possible to find out if a string is a valid IPv4 address and to find out if it's a valid hostname. But: It's not possible to find out if a string is either a valid IPv4 address or a valid hostname. The reason: Any string that is matched as a valid IPv4 address would also be a valid hostname that could be resolved to a different IP address by the DNS server. – ndsvw Aug 09 '20 at 06:45

21 Answers

555

You can use the following regular expressions separately or by combining them in a joint OR expression.

ValidIpAddressRegex = "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";

ValidHostnameRegex = "^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$";

ValidIpAddressRegex matches valid IP addresses and ValidHostnameRegex matches valid host names. Depending on the language you use, \ may have to be escaped as \\.
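As a minimal sketch of the joint OR expression, assuming Python's `re` module (the names `VALID_IP`, `VALID_HOSTNAME` and `is_ip_or_hostname` are illustrative and not part of the answer):

```python
import re

# The two patterns from above, as raw strings so "\." needs no extra escaping.
VALID_IP = r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
VALID_HOSTNAME = r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$"

# Joint OR expression: each pattern goes into its own group.
IP_OR_HOSTNAME = re.compile(f"(?:{VALID_IP})|(?:{VALID_HOSTNAME})")


def is_ip_or_hostname(text):
    # fullmatch() also rejects a trailing newline, which "$" alone would allow.
    return IP_OR_HOSTNAME.fullmatch(text) is not None


print(is_ip_or_hostname("10.48.0.200"))   # True
print(is_ip_or_hostname("example.com"))   # True
print(is_ip_or_hostname("-bad.example"))  # False
```

Note that a dotted string of digits such as 999.1.1.1 still passes, because the hostname alternative accepts all-numeric labels.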


ValidHostnameRegex is valid as per RFC 1123. Originally, RFC 952 specified that hostname segments could not start with a digit.

http://en.wikipedia.org/wiki/Hostname

The original specification of hostnames in RFC 952, mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits.

Valid952HostnameRegex = "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$";
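To make the difference between the two specifications concrete, here is a small Python sketch (assuming the `re` module; `LABEL_1123`, `LABEL_952` and `hostname_pattern` are hypothetical names) that builds both patterns from their label sub-pattern:

```python
import re

# Label sub-patterns taken from the two regexes above.
LABEL_1123 = r"(?:[a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])"
LABEL_952 = r"(?:[a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])"


def hostname_pattern(label):
    # Zero or more "label." prefixes followed by a final label.
    return re.compile(rf"(?:{label}\.)*{label}")


RFC1123 = hostname_pattern(LABEL_1123)
RFC952 = hostname_pattern(LABEL_952)

for name in ("example.com", "3com.com", "-bad.com"):
    print(name, bool(RFC1123.fullmatch(name)), bool(RFC952.fullmatch(name)))
# example.com True True
# 3com.com True False
# -bad.com False False
```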
Jorge Ferreira
  • Your hostname regex is pretty good and looks like it matches everything. You should change your answer so it doesn't have the double escaping for periods and hyphens, and the sz which makes it look like some Microsoft language. – Neil Jun 05 '09 at 19:48
  • 3
    Here: http://stackoverflow.com/questions/4645126/looking-for-regex-for-hostname-validation - I explain that names that start with a digit are considered as valid as well. Also, only one dot is questionable issue. Would be great to have more feedback on that. – BreakPhreak Jan 10 '11 at 09:07
  • 16
    You might want to add IPv6. The OP didn't specify *what type* of address. (By the way, it can be found [here](http://stackoverflow.com/questions/53497/regular-expression-that-matches-valid-ipv6-addresses/53499#53499)) – new123456 Feb 27 '11 at 19:28
  • could you please provide a single regular expressions to test both the conditions i.e. hostname and ip? – Zain Shaikh Nov 01 '11 at 11:29
  • @ZainShaikh You can put them together as `()|()`. That's what he says at the top: "by combining them in a joint OR expression". – Matthew Read Feb 03 '12 at 22:02
  • At least in Javascript, this regexp evaluates greedily and matches only the first number of the last octet if it's > 9. Reversing the order of the capture groups of the last segment allows it to properly match full range of IP's. – Fuu Jul 19 '12 at 12:59
  • 1
    I've been using the ValidHostnameRegex to pull domains out of unstructured strings, and it seems that as written this regex in Python only captures the first character of the TLD. Adjusting it to this corrects the issue: `((([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\\-]*[a-zA-Z0-9])\\.)*([A-Za-z0-9][A-Za-z0-9\\-]*[A-Za-z0-9]))` – bradreaves Dec 13 '12 at 02:52
  • 32
    Before people blindly use this in their code, note that it is not completely accurate. It ignores RFC2181: "The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. The length of any one label is limited to between 1 and 63 octets. A full domain name is limited to 255 octets (including the separators)." – rouble Feb 08 '13 at 18:15
  • 1
    And what about non-latin host names? – UserControl Feb 14 '13 at 08:08
  • 2
    I think there is something wrong with ValidIpAddressRegex: http://regexr.com?35830. Since regular expression engines are eager, in the last octet the engine sees the 2, takes it as a match, and stops. So in my solution I reverse the order of the alternatives: http://regexr.com?35833. `((((25[0-5])|(2[0-4]\d)|([01]?\d?\d)))\.){3}((((25[0-5])|(2[0-4]\d)|([01]?\d?\d))))` – narek Jun 15 '13 at 14:42
  • -1, because while it's *goodish* it doesn't adhere to the RFCs as it claims to be. – Alix Axel Jul 21 '13 at 08:34
  • 7
    @UserControl: Non-latin (Punycoded) hostnames must be converted to ASCII form first (`éxämplè.com` = `xn--xmpl-loa1ab.com`) and then validated. – Alix Axel Jul 21 '13 at 08:36
  • Your IP regex disallows leading 0's e.g. `127.000.000.001` (which I have seen, though it's daft) or `127.0.0.0000001` (which is even more daft). Is this deliberate? Personally I would consider it valid (and ping on OS X does too). – Partly Cloudy Jul 26 '13 at 23:26
  • why so many up voted this answer I think this is bad Regex, it will only match if you have clean IP list. – ewwink Sep 27 '13 at 05:34
  • 1
    regarding ValidHostnameRegex: according to http://www.ietf.org/rfc/rfc1034.txt, section 3.1 page 7, trailing dots are valid (e.g.: "poneria.ISI.EDU." is a valid host name) - which is not accounted for in this regex. In fact this makes the regex even simpler: "^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\\-]*[a-zA-Z0-9])\\.?)+" – rich Oct 02 '13 at 14:45
  • @AlixAxel do you have code to do the conversion of Non-latin hostnames? – Shebuka Oct 11 '13 at 10:19
  • @Shebuka: I would just use something like `idn_to_ascii()` in PHP. – Alix Axel Oct 11 '13 at 20:08
  • Maybe you should match FQDN too. Please add an optional period to the end of all domain name regexes. – schmijos Nov 13 '13 at 13:56
  • 1
    @Partly Cloudy: Leading zeroes are allowed but are interpreted differently. If there is a leading zero in a component, that component is interpreted as octal notation. This is unexpected by most users. – Jon Trauntvein Nov 21 '13 at 17:32
  • 1
    To support trailing dots, `\.?` could be added at the end; the absolute representation used by DNS is described in RFC 1034, see http://www.dns-sd.org/TrailingDotsInDomainNames.html – Paweł Prażak Dec 24 '13 at 10:31
  • 1
    Also perhaps consider single-letter hostnames: http://serverfault.com/questions/162038/are-one-letter-host-names-valid – ChaimKut Feb 24 '14 at 08:48
  • @JonTrauntvein, I have seen many places where leading zeros are accepted in IP addresses in dot-decimal notation, not with an octal meaning but as plain decimal, as in 192.168.000.028 being equivalent to 192.168.0.28. Why write a regexp for IPv4 addresses/hostnames when the internet is migrating to IPv6? – Luis Colorado Sep 18 '14 at 06:36
  • 7
    Your hostname expression is matching some invalid values: I tried `123.456.789.0` and it says it's a valid hostname. – lbarreira Sep 23 '14 at 11:54
  • Seems that your suggested solution accepts IP addresses that starts with zero. I suggest to refactor your solution to: (([1-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){1}(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){2}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5]) – Maxim Kirilov Mar 10 '15 at 11:47
  • 2
    There is a small mistake in Valid952HostnameRegex, I have corrected it: "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)+([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$" – Milos Gavrilov Jul 09 '15 at 13:14
  • @lbarreira see Alban's [answer below for a realistic regex](http://stackoverflow.com/a/14453696/802365) – Édouard Lopez Apr 22 '16 at 14:47
  • As suggested in other answer. this is working as well as expected: ^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$ – Marco Dec 02 '16 at 09:06
  • underscore should be a valid character but I don't think this solution accounts for it. – gunslingor Jan 03 '17 at 15:14
  • That IP regex isn't very good tbh, might want to use mine instead:((1?\d\d?|2[0-4]\d|25[0-5])\.){3}(1?\d\d?|2[0-4]\d|25[0-5]) – Morg. Jan 13 '17 at 09:25
  • 1
    @MilosGavrilov You're the best! Thanks for fixing it! Combined both (IP and Hostname) into a single regexp. See: https://regex101.com/r/0WMysi/2 – Bazardshoxer Feb 16 '17 at 17:20
  • I don't know why it shoud be [a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-] rather than [a-zA-Z0-9][a-zA-Z0-9\-]. – codexplorer Jun 13 '19 at 06:39
68

The hostname regex of smink does not observe the limitation on the length of individual labels within a hostname. Each label within a valid hostname may be no more than 63 octets long.

ValidHostnameRegex="^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])\
(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$"

Note that the backslash at the end of the first line (above) is Unix shell syntax for splitting the long line. It's not a part of the regular expression itself.

Here's just the regular expression alone on a single line:

^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$

You should also check separately that the total length of the hostname does not exceed 255 characters. For more information, please consult RFC-952 and RFC-1123.
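A sketch of how the per-label limit and the separate total-length check might be combined in Python. The label pattern is an equivalent, slightly shorter form of the one above, and `is_valid_hostname` and `MAX_HOSTNAME_LENGTH` are hypothetical names; the 255 limit is the figure stated in this answer:

```python
import re

# One label: 1 to 63 characters, alphanumeric at both ends, hyphens allowed inside.
LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])?"
HOSTNAME = re.compile(rf"{LABEL}(?:\.{LABEL})*")

MAX_HOSTNAME_LENGTH = 255  # total length limit as stated in the answer


def is_valid_hostname(name):
    # The regex cannot express the total length, so check it separately.
    if len(name) > MAX_HOSTNAME_LENGTH:
        return False
    return HOSTNAME.fullmatch(name) is not None


print(is_valid_hostname("a" * 63 + ".com"))  # True
print(is_valid_hostname("a" * 64 + ".com"))  # False
```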

  • 7
    Excellent host pattern. It probably depends on one's language's regex implementation, but for JS it can be adjusted slightly to be briefer without losing anything: `/^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?(\.[a-z\d]([a-z\d\-]{0,61}[a-z\d])?)*$/i` – Semicolon Feb 01 '15 at 23:46
33

To match a valid IP address use the following regex:

(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}

instead of:

([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])(\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])){3}

Explanation

Many regex engines stop at the first alternative in an OR sequence that matches. For instance, try both regexes against the following input:

10.48.0.200

Test

Test the difference between good vs bad
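The effect can be reproduced with Python's `re` module; a small sketch, where `GOOD` and `BAD` are just illustrative names for the two regexes above:

```python
import re

GOOD = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}"
BAD = r"([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])(\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])){3}"

text = "10.48.0.200"

# Without anchors, the first alternative that succeeds wins.
good_match = re.search(GOOD, text).group(0)
bad_match = re.search(BAD, text).group(0)

print(good_match)  # 10.48.0.200
print(bad_match)   # 10.48.0.20  (the last octet is cut short)
```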

Alban
  • 5
    Do not forget start ^ and end $ or something like 0.0.0.999 or 999.0.0.0 will match too. ;) – andreas Nov 28 '13 at 13:53
  • 1
    Yes, to validate a string, the start ^ and end $ anchors are required, but if you are searching for an IP within a text, do not use them. – Alban Nov 28 '13 at 15:04
  • The unintended 'non-greedyness' that you identify applies to the other host name solutions as well. It would be worth adding this to your answer as the others will not match the full hostname. e.g. `([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*` versus `([a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]|[a-zA-Z0-9])(\.([a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])|[a-zA-Z0-9]))*` – ergohack Dec 06 '17 at 18:37
  • EDIT: In the above, use `+` at the end instead of `*` to see the failure. – ergohack Dec 06 '17 at 20:50
6

I don't seem to be able to edit the top post, so I'll add my answer here.

For IP addresses, the easy answer is the egrep example here -- http://www.linuxinsight.com/how_to_grep_for_ip_addresses_using_the_gnu_egrep_utility.html

egrep '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}'

Note, though, that this pattern doesn't account for values like 0 in the first octet, or values greater than 254 (IP address) or 255 (netmask). An additional if statement could help.
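The additional check could look like this in Python. This is only a sketch: `find_ipv4` is a hypothetical helper, and the 0 to 255 range is an assumption (tighten it if 0 or 255 must be excluded):

```python
import re

# Same coarse shape as the egrep pattern: four groups of 1 to 3 digits.
CANDIDATE = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def find_ipv4(text):
    found = []
    for match in CANDIDATE.finditer(text):
        octets = [int(part) for part in match.group(0).split(".")]
        # The "additional if statement": reject out-of-range octets.
        if all(0 <= octet <= 255 for octet in octets):
            found.append(match.group(0))
    return found


print(find_ipv4("hosts: 10.0.0.1, 999.999.999.999 and 192.168.1.254"))
# ['10.0.0.1', '192.168.1.254']
```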

As for legal DNS hostnames, provided that you are checking for internet hostnames only (and not intranet), I wrote the following snippet, a mix of shell/PHP, but it should be applicable as any regular expression.

First, download and parse the list of legal top-level domain names from the IANA website:

tld=$(curl -s http://data.iana.org/TLD/tlds-alpha-by-domain.txt |  sed 1d  | cut -f1 -d'-' | tr '\n' '|' | sed 's/\(.*\)./\1/')
echo "($tld)"

That should give you a nice piece of re code that checks for legality of top domain name, like .com .org or .ca

Then add the first part of the expression according to the guidelines found here -- http://www.domainit.com/support/faq.mhtml?category=Domain_FAQ&question=9 (any alphanumeric combination and the '-' symbol; the dash should not be at the beginning or end of a label).

(([a-z0-9]+|([a-z0-9]+[-]+[a-z0-9]+))[.])+

Then put it all together (PHP preg_match example):

$pattern = '/^(([a-z0-9]+|([a-z0-9]+[-]+[a-z0-9]+))[.])+(AC|AD|AE|AERO|AF|AG|AI|AL|AM|AN|AO|AQ|AR|ARPA|AS|ASIA|AT|AU|AW|AX|AZ|BA|BB|BD|BE|BF|BG|BH|BI|BIZ|BJ|BM|BN|BO|BR|BS|BT|BV|BW|BY|BZ|CA|CAT|CC|CD|CF|CG|CH|CI|CK|CL|CM|CN|CO|COM|COOP|CR|CU|CV|CX|CY|CZ|DE|DJ|DK|DM|DO|DZ|EC|EDU|EE|EG|ER|ES|ET|EU|FI|FJ|FK|FM|FO|FR|GA|GB|GD|GE|GF|GG|GH|GI|GL|GM|GN|GOV|GP|GQ|GR|GS|GT|GU|GW|GY|HK|HM|HN|HR|HT|HU|ID|IE|IL|IM|IN|INFO|INT|IO|IQ|IR|IS|IT|JE|JM|JO|JOBS|JP|KE|KG|KH|KI|KM|KN|KP|KR|KW|KY|KZ|LA|LB|LC|LI|LK|LR|LS|LT|LU|LV|LY|MA|MC|MD|ME|MG|MH|MIL|MK|ML|MM|MN|MO|MOBI|MP|MQ|MR|MS|MT|MU|MUSEUM|MV|MW|MX|MY|MZ|NA|NAME|NC|NE|NET|NF|NG|NI|NL|NO|NP|NR|NU|NZ|OM|ORG|PA|PE|PF|PG|PH|PK|PL|PM|PN|PR|PRO|PS|PT|PW|PY|QA|RE|RO|RS|RU|RW|SA|SB|SC|SD|SE|SG|SH|SI|SJ|SK|SL|SM|SN|SO|SR|ST|SU|SV|SY|SZ|TC|TD|TEL|TF|TG|TH|TJ|TK|TL|TM|TN|TO|TP|TR|TRAVEL|TT|TV|TW|TZ|UA|UG|UK|US|UY|UZ|VA|VC|VE|VG|VI|VN|VU|WF|WS|XN|XN|XN|XN|XN|XN|XN|XN|XN|XN|XN|YE|YT|YU|ZA|ZM|ZW)[.]?$/i';

    if (preg_match($pattern, $matching_string)) {
        // ... do stuff
    }

You may also want to add an if statement to check that the string you are checking is shorter than 256 characters -- http://www.ops.ietf.org/lists/namedroppers/namedroppers.2003/msg00964.html
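As a Python counterpart to the steps above, the following sketch builds the TLD alternation from text in the one-entry-per-line format of the IANA file. `SAMPLE_TLD_LIST` is a made-up excerpt for illustration, and `tld_alternation` and `hostname_with_tld_pattern` are hypothetical names. Unlike `cut -f1 -d'-'` in the shell version, it keeps entries such as XN--P1AI whole rather than shortening them to XN:

```python
import re

# Illustrative excerpt only; the real list is much longer.
SAMPLE_TLD_LIST = """\
# header comment line (the real file starts with a version line)
COM
ORG
CA
MUSEUM
XN--P1AI
"""

# Label part taken verbatim from the answer.
LABELS = r"(([a-z0-9]+|([a-z0-9]+[-]+[a-z0-9]+))[.])+"


def tld_alternation(listing):
    """Turn one-TLD-per-line text into a regex group like (COM|ORG|...)."""
    tlds = [line.strip() for line in listing.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]
    return "(" + "|".join(re.escape(tld) for tld in tlds) + ")"


def hostname_with_tld_pattern(listing):
    # Labels, then a known TLD, then an optional trailing dot.
    return re.compile(LABELS + tld_alternation(listing) + r"[.]?", re.IGNORECASE)


pattern = hostname_with_tld_pattern(SAMPLE_TLD_LIST)
for name in ("example.com", "example.foo"):
    # Length check from the answer, then the pattern itself.
    print(name, len(name) < 256 and pattern.fullmatch(name) is not None)
# example.com True
# example.foo False
```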

Alex Volkov
  • 1
    -1 because this matches bogus IP addresses like “999.999.999.999”. – bdesham Feb 06 '14 at 15:50
  • 1
    "Though the case doesn't account for values like 0 in the fist octet, and values greater than 254 (ip addres) or 255 (netmask)." – Alex Volkov Feb 08 '14 at 23:38
  • I saw that you qualified your answer, yes. I downvoted because that part of your answer is still not useful. – bdesham Feb 09 '14 at 02:49
3

It's worth noting that there are libraries for most languages that do this for you, often built into the standard library. And those libraries are likely to get updated a lot more often than code that you copied off a Stack Overflow answer four years ago and forgot about. And of course they'll also generally parse the address into some usable form, rather than just giving you a match with a bunch of groups.

For example, detecting and parsing IPv4 in (POSIX) C:

#include <arpa/inet.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  for (int i=1; i!=argc; ++i) {
    struct in_addr addr = {0};
    printf("%s: ", argv[i]);
    if (inet_pton(AF_INET, argv[i], &addr) != 1)
      printf("invalid\n");
    else
      printf("%u\n", addr.s_addr);
  }
  return 0;
}

Obviously, such functions won't work if you're trying to, e.g., find all valid addresses in a chat message—but even there, it may be easier to use a simple but overzealous regex to find potential matches, and then use the library to parse them.

For example, in Python:

>>> import ipaddress
>>> import re
>>> msg = "My address is 192.168.0.42; 192.168.0.420 is not an address"
>>> for maybeip in re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', msg):
...     try:
...         print(ipaddress.ip_address(maybeip))
...     except ValueError:
...         pass
...
192.168.0.42
abarnert
2
import re

def isValidHostname(hostname):
    if len(hostname) > 255:
        return False
    if hostname[-1:] == ".":
        hostname = hostname[:-1]   # strip exactly one dot from the right,
                                   #  if present
    # Each label: 1-63 letters, digits or hyphens, not starting or ending with "-"
    allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(x) for x in hostname.split("."))
PythonDev
  • Could you explain this regex? Exactly, what do (?!-) and (?<!-) mean? – Scit Jan 21 '16 at 12:13
  • 1
    @Scit, those make sure it does not start or end with a "-" character if your regex engine allow their use. For example, [from Python](https://docs.python.org/2/library/re.html) or [from Perl](http://perldoc.perl.org/perlre.html#(%3f%3epattern)). – YLearn Feb 19 '16 at 05:22
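A short Python sketch of what the two lookarounds do for a single label (the name `LABEL` and the sample labels are illustrative): `(?!-)` refuses to start at a hyphen, and `(?<!-)` refuses to end right after one.

```python
import re

# Same label pattern as in the answer above.
LABEL = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)

for label in ("example", "ex-ample", "-example", "example-"):
    print(label, bool(LABEL.match(label)))
# example True
# ex-ample True
# -example False
# example- False
```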
1
/^(?:[a-zA-Z0-9]+|[a-zA-Z0-9][-a-zA-Z0-9]+[a-zA-Z0-9])(?:\.[a-zA-Z0-9]+|[a-zA-Z0-9][-a-zA-Z0-9]+[a-zA-Z0-9])?$/
Dharman
1
"^((\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])\\.){3}(\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])$"
zangw
1

This works for valid IP addresses:

regex = '^([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])$'
aliasav
1

I think this is the best IP validation regex. Please check it:

^(([01]?[0-9]?[0-9]|2([0-4][0-9]|5[0-5]))\.){3}([01]?[0-9]?[0-9]|2([0-4][0-9]|5[0-5]))$
Prakash Thapa
0

I found this works pretty well for IP addresses. It validates like the top answer, but it also makes sure the IP is isolated, so that no text or additional digits/dots come directly before or after the IP.

(?<!\S)(?:(?:\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\b|\.\b){7}(?!\S)
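The `{7}` counts four octets plus three dots, since each repetition consumes either one octet or one dot, and the `\b` after each piece forces octets and dots to alternate. A minimal Python sketch, assuming the dot is meant literally (written `\.` here); `ISOLATED_IP` and the sample text are illustrative:

```python
import re

ISOLATED_IP = re.compile(
    r"(?<!\S)(?:(?:\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\b|\.\b){7}(?!\S)"
)

text = "ping 192.168.0.1 now, not 192.168.0.1x or 1.2.3"

# Only the address surrounded by whitespace is reported.
matches = [m.group(0) for m in ISOLATED_IP.finditer(text)]
print(matches)  # ['192.168.0.1']
```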

Andrew
  • I tried a lot but I could not understand 2 things here. 1. \b specifies word boundary Why are we using \b ? which is the boundary? and 2. Why does it work only for {7} From what I understood, I think it should be {4} but, it is not working. Optionally, you could tell about why are you using a non-capturing blocks. – Srichakradhar Dec 25 '13 at 18:04
0

Here is a regex that I used in Ant to obtain a proxy host IP or hostname out of ANT_OPTS. This was used to obtain the proxy IP so that I could run an Ant "isreachable" test before configuring a proxy for a forked JVM.

^.*-Dhttp\.proxyHost=(\w{1,}\.\w{1,}\.\w{1,}\.*\w{0,})\s.*$
0
AddressRegex = "^(ftp|http|https):\/\/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5})$";

HostnameRegex =  /^(ftp|http|https):\/\/([a-z0-9]+\.)?[a-z0-9][a-z0-9-]*((\.[a-z]{2,6})|(\.[a-z]{2,6})(\.[a-z]{2,6}))$/i

These regexes are intended only for this kind of validation.

They match only URLs such as http://www.kk.com and http://www.kk.co.in

They do not match:

http://www.kk.com/ http://www.kk.co.in.kk

http://www.kk.com/dfas http://www.kk.co.in/

ayu for u
0

try this:

((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)

it works in my case.

seraphim
0

Regarding IP addresses, it appears that there is some debate on whether to include leading zeros. It was once the common practice and is generally accepted, so I would argue that they should be flagged as valid regardless of the current preference. There is also some ambiguity over whether text before and after the string should be validated and, again, I think it should. 1.2.3.4 is a valid IP but 1.2.3.4.5 is not and neither the 1.2.3.4 portion nor the 2.3.4.5 portion should result in a match. Some of the concerns can be handled with this expression:

grep -E '(^|[^[:alnum:]])(([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])\.){3}([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])([^[:alnum:]]|$)'

The unfortunate part here is that the regex portion that validates an octet is repeated, as is true in many of the offered solutions. Although this is better than four instances of the pattern, the repetition can be eliminated entirely if subroutines are supported by the regex engine being used. The next example enables those functions with the -P switch of grep and also takes advantage of lookahead and lookbehind functionality. (The subroutine name I selected is 'o' for octet. I could have used 'octet' as the name but wanted to be terse.)

grep -P '(?<![\d\w\.])(?<o>([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(\.\g<o>){3}(?![\d\w\.])'

The handling of the dot might actually create false negatives if IP addresses are in a file with text in the form of sentences, since a period could follow an address without being part of the dotted notation. A variant of the above would fix that:

grep -P '(?<![\d\w\.])(?<x>([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(\.\g<x>){3}(?!([\d\w]|\.\d))'
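Python's standard `re` module has no subroutine calls, but the same repetition can be avoided by composing the pattern from a string. A sketch under that assumption, with the alternatives inside the octet reordered to put the longest first; `OCTET` and `IPV4_IN_TEXT` are hypothetical names:

```python
import re

# The octet sub-pattern is written once and interpolated wherever it is needed.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})"

# Not preceded by a word character or dot; not followed by a word
# character or by another ".digit" group. A sentence-ending period is fine.
IPV4_IN_TEXT = re.compile(rf"(?<![\w.]){OCTET}(?:\.{OCTET}){{3}}(?!\w|\.\d)")

text = "Gateway is 1.2.3.4. Ignore 1.2.3.4.5 and 192.168.0.420."
print(IPV4_IN_TEXT.findall(text))  # ['1.2.3.4']
```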
0
>>> import re
>>> pattern = r"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$"
>>> print(bool(re.match(pattern, "testhostn.ame")))
True
>>> print(bool(re.match(pattern, "testhostn....ame")))
False
>>> print(bool(re.match(pattern, "testhostn.A.ame")))
True
Mohammad Shahid Siddiqui
0

The new Network framework has failable initializers for struct IPv4Address and struct IPv6Address which handle the IP address portion very easily. Doing this in IPv6 with a regex is tough with all the shortening rules.

Unfortunately I don't have an elegant answer for hostname.

Note that Network framework is recent, so it may force you to compile for recent OS versions.

import Network
let tests = ["192.168.4.4","fkjhwojfw","192.168.4.4.4","2620:3","2620::33"]

for test in tests {
    if let _ = IPv4Address(test) {
        debugPrint("\(test) is valid ipv4 address")
    } else if let _ = IPv6Address(test) {
        debugPrint("\(test) is valid ipv6 address")
    } else {
        debugPrint("\(test) is not a valid IP address")
    }
}

output:
"192.168.4.4 is valid ipv4 address"
"fkjhwojfw is not a valid IP address"
"192.168.4.4.4 is not a valid IP address"
"2620:3 is not a valid IP address"
"2620::33 is valid ipv6 address"
Darrell Root
-1

how about this?

([0-9]{1,3}\.){3}[0-9]{1,3}
Saikrishna Rao
-1

In PHP: filter_var(gethostbyname($dns), FILTER_VALIDATE_IP) == true ? 'ip' : 'not ip'

sirjay
  • 2
    While this code may answer the question, generally _explanation alongside_ code makes an answer much more useful. Please [edit] your answer and provide some context and explanation. – Sebastian Simon Jan 11 '16 at 18:21
  • 1
    And, unless I'm mistaken, FILTER_VALIDATE_IP is a PHP only value. – DonGar Jan 24 '16 at 23:30
-2

I came up with this simple regex pattern for matching IP addresses: \d+[.]\d+[.]\d+[.]\d+

Dody
  • 117
  • 2
  • 2
  • 4
  • 1111.1.1.1 is not a valid IP. There's no way to really test an IP format if you don't take subnets into account. You should at least limit the number of digits with something like `^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}`, and of course that is still not the correct way. If you have a language to write scripts in, you surely have access to its network functions. The best way to check a REAL IP is to tell the system to convert the IP to its proper format and then check for true/false. In Python I use `socket.inet_aton(ip)`. In PHP you need `inet_pton($ip)`. – m3nda Jun 05 '16 at 16:18
  • Python users can take a look here: https://gist.github.com/erm3nda/f25439bba66931d3ca9699b2816e796c – m3nda Jun 05 '16 at 16:21
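A minimal sketch of the `socket.inet_aton` approach mentioned in these comments; `is_ipv4` is a hypothetical helper name. Be aware that `inet_aton` is lenient on many platforms and may accept shorthand forms such as 127.1, so it is not a strict dotted-quad check:

```python
import socket


def is_ipv4(text):
    # Let the system's address parser decide instead of a regex.
    try:
        socket.inet_aton(text)
    except OSError:
        return False
    return True


print(is_ipv4("192.168.0.1"))  # True
print(is_ipv4("1111.1.1.1"))   # False
```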
-2

Checking for host names like... mywebsite.co.in, thangaraj.name, 18thangaraj.in, thangaraj106.in etc.,

[a-z\d+].*?\\.\w{2,4}$
Thangaraj
  • 3
    -1. The OP asked for something “well tested to exactly match the latest RFC specs”, but this does not match e.g. *.museum, while it will match *.foo. [Here’s a list](http://data.iana.org/TLD/tlds-alpha-by-domain.txt) of valid TLDs. – bdesham Feb 06 '14 at 15:53
  • I'm not sure it's a good idea to put the plus inside the character class (square brackets), furthermore, there are TLDs with 5 letters (**.expert** for example). – Yaron Aug 30 '14 at 15:52
  • Best way to accomplish with RFC is to use the system/language functions. `inet_aton` is good enough. – m3nda Jun 05 '16 at 16:20