
I am trying to write a regular expression that will match domains in a sentence.

I found this post, which was very useful and helped me create the following expression. It matches domains, but unfortunately it also matches IP addresses, which I do not want:

((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})

I want to update my expression so that the following are still found wherever they appear (in a sentence, between brackets, etc.):

www.example.com
subdomain.example.com
subdomain.example.co.uk

But not:

192.168.0.0
127.0.0.1

Is there a way to do this?

Twiggy

2 Answers


We could prepend a simple negative lookahead, (?![\d.]+), which stops a match from starting at a digit or a dot and therefore excludes strings made up only of digits and dots:

(?![\d.]+)((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
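
To see the effect of the lookahead, here is a minimal Python sketch that runs both expressions over a sentence built from the examples in the question. The names `ORIGINAL`, `WITH_LOOKAHEAD`, `find_domains` and `sample` are made up for this illustration, and Python's `re` module is only an assumption, since the question does not say which regex engine is in use.

```python
import re

# Pattern from the question: matches domains, but also dotted IPv4 strings.
ORIGINAL = (
    r"((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\."
    r"(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})"
)

# Same pattern with the negative lookahead from this answer prepended.
WITH_LOOKAHEAD = r"(?![\d.]+)" + ORIGINAL


def find_domains(pattern, text):
    """Return the full text of every match (hypothetical helper)."""
    # finditer + group(0) avoids the tuples re.findall returns
    # for patterns that contain capture groups.
    return [m.group(0) for m in re.finditer(pattern, text)]


sample = (
    "see www.example.com (or subdomain.example.co.uk) "
    "but not 192.168.0.0 or 127.0.0.1 here"
)

print(find_domains(ORIGINAL, sample))
# ['www.example.com', 'subdomain.example.co.uk', '192.168.0.0', '127.0.0.1']
print(find_domains(WITH_LOOKAHEAD, sample))
# ['www.example.com', 'subdomain.example.co.uk']
```

Because the trailing character class in the expression includes the dot, a full stop placed directly after a domain would be swallowed into the match, which is why the sample sentence avoids that.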


wp78de

The answer from @wp78de is correct; however, it would not detect domains that start with numerical digits, e.g. 123reg.com.

So remove the first group (the lookahead) from the regex, like this:

((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\.(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})
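
Note that with the lookahead removed this is the expression from the question again, so it would also match 192.168.0.0 and 127.0.0.1. One possible middle ground, offered as a sketch under assumptions rather than a definitive fix: keep a lookahead, but have it reject only tokens that consist entirely of digits and dots up to the next non-domain character. The names `BASE`, `LEADING_DIGIT_BLOCK`, `TIGHTER`, `find_domains` and `sample` are hypothetical, and Python's `re` module is assumed.

```python
import re

# Expression from the question, without any IP-address guard.
BASE = (
    r"((?!-))(xn--)?[a-z0-9][a-z0-9-_]{0,61}[a-z0-9]{0,1}\."
    r"(xn--)?([a-z0-9\._-]{1,61}|[a-z0-9-]{1,30})"
)

# Lookahead from the other answer: blocks any match starting at a digit or dot.
LEADING_DIGIT_BLOCK = r"(?![\d.]+)" + BASE

# Assumed alternative: block a start position only when the run of digits and
# dots is NOT followed by another domain character, i.e. the whole token is
# numeric, as in an IPv4 address.
TIGHTER = r"(?![\d.]+(?![a-z0-9._-]))" + BASE


def find_domains(pattern, text):
    """Return the full text of every match (hypothetical helper)."""
    return [m.group(0) for m in re.finditer(pattern, text)]


sample = "try 123reg.com and www.example.com but not 127.0.0.1 or 192.168.0.0"

print(find_domains(LEADING_DIGIT_BLOCK, sample))
# ['reg.com', 'www.example.com']
print(find_domains(TIGHTER, sample))
# ['123reg.com', 'www.example.com']
```

The tighter lookahead relies on backtracking: for 123reg.com no prefix of digits is followed by a non-domain character, so the guard lets the match through, while for 127.0.0.1 the whole token is digits and dots and every start position is rejected.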
Sahil