
I'm trying to use a regex to validate user-entered URLs. I came up with this regex:

function is_valid_url(url)
{
    // test() returns true or false; match() would return an array or null.
    return /^(ht|f)tps?:\/\/[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?$/.test(url);
}

It works fine for most simple URLs. However, when I tried to enter this URL from Google Maps:

http://maps.google.com/maps?f=d&source=s_d&saddr=Brooklyn%2C+NY&daddr=Stewart+Ave&hl=en&geocode=FRBFbAId0JyX-ykJIXyUFkTCiTGGeAAEdFx2gg%3BFcAqbQIdgPuX-w&mra=mift&mrsp=1&sz=12&sll=40.65%2C-73.95&sspn=0.182857%2C0.308647&g=Brooklyn%2C+New+York%2C+NY%2C+United+States&ie=UTF8&z=12

Then the function returns false, even though this URL is correct.

I know using regex for URL validation is controversial as there's no perfect solution for it, but I want to know if you have any regex that works better than mine, and can return true for that kind of URL.

  • What makes you believe all TLDs have 2-4 characters? Also, your regex doesn't support subdomains - so even a `www.` in the URL would break it – ThiefMaster May 18 '11 at 12:06
  • possible duplicate of [url validation using javascript](http://stackoverflow.com/questions/1303872/url-validation-using-javascript) – kapa May 18 '11 at 12:07
  • @ThiefMaster: I feel sorry for .museum, it never gets any love from URL regexes. – Andy E May 18 '11 at 12:08
  • Also, why do you believe that # and % make links invalid? – enoyhs May 18 '11 at 12:08
  • Please try it first, it works well with subdomains and www. :) –  May 18 '11 at 12:09
  • Look at the complete regular expression for url validation. http://internet.ls-la.net/folklore/url-regexpr.html – Satish May 18 '11 at 12:08
  • Have a look at [parseUri 1.2: Split URLs in JavaScript](http://blog.stevenlevithan.com/archives/parseuri) this should give you a good library to handle validation. – Nick Weaver May 18 '11 at 12:08
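
The comment about `#` and `%` points at the likely cause: the last character class in the question's regex excludes `%`, and the Google Maps URL contains percent-encoded characters such as `%2C`. The sketch below is only an illustration of that point. The names `original`, `allowPercent`, and `mapsUrl` are made up for this example, and `allowPercent` is a hypothetical variant that differs from the question's pattern only by dropping `%` from the excluded set; it is not a recommended validator.

```javascript
// The pattern from the question, copied unchanged.
const original = /^(ht|f)tps?:\/\/[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#%"\,\{\}\\|\\\^\[\]`]+)?$/;

// Hypothetical variant: identical, except that "%" is no longer excluded.
const allowPercent = /^(ht|f)tps?:\/\/[a-z0-9-\.]+\.[a-z]{2,4}\/?([^\s<>\#"\,\{\}\\|\\\^\[\]`]+)?$/;

// The Google Maps URL from the question.
const mapsUrl = "http://maps.google.com/maps?f=d&source=s_d&saddr=Brooklyn%2C+NY&daddr=Stewart+Ave&hl=en&geocode=FRBFbAId0JyX-ykJIXyUFkTCiTGGeAAEdFx2gg%3BFcAqbQIdgPuX-w&mra=mift&mrsp=1&sz=12&sll=40.65%2C-73.95&sspn=0.182857%2C0.308647&g=Brooklyn%2C+New+York%2C+NY%2C+United+States&ie=UTF8&z=12";

console.log(original.test(mapsUrl));      // false: the first "%" stops the match
console.log(allowPercent.test(mapsUrl));  // true
```

Allowing `%` only addresses this one URL; the other characters excluded by that class (for example `#` and `,`) would still cause rejections.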

2 Answers

^((http|https|ftp):\/\/)?([a-z]+\.)?[a-z0-9-]+(\.[a-z]{1,4}){1,2}(/.*\?.*)?$

Matches

http://www.example.com
www.example.com
example.com
example.info
abc.com.uk
www.example.co.in
www.example.com.sg
example.com.sg
t.com
co.co
https://www.t.co
asd.com.io/abc?foo=blah

False positives

abc.com.sg.in
example.com.aero.uk
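
To check the lists above, the pattern can be run against a few of the sample inputs. This is a minimal sketch: `urlPattern`, `samples`, and `acceptedSamples` are names invented for the example, and the only change to the pattern is escaping `/` so that it can be written as a JavaScript regex literal.

```javascript
// Pattern from this answer, with "/" escaped for use as a regex literal.
const urlPattern = /^((http|https|ftp):\/\/)?([a-z]+\.)?[a-z0-9-]+(\.[a-z]{1,4}){1,2}(\/.*\?.*)?$/;

// A few inputs taken from the "Matches" and "False positives" lists.
const samples = [
    "http://www.example.com",
    "example.com.sg",
    "asd.com.io/abc?foo=blah",
    "abc.com.sg.in",        // listed as a false positive
    "example.com.aero.uk",  // listed as a false positive
];

// Hypothetical helper: returns the inputs that the pattern accepts.
function acceptedSamples(pattern, inputs) {
    return inputs.filter((input) => pattern.test(input));
}

// All five samples are accepted, including the two false positives.
console.log(acceptedSamples(urlPattern, samples));
```

Note that the trailing group `(\/.*\?.*)?` only accepts a path when it contains a `?`, so an input such as `example.com/abc` is rejected by this pattern.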

Easiest option: use a regex that works.

(((http|ftp|https):\/\/)|www\.)[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#!]*[\w\-\@?^=%&/~\+#])?

Regexr: http://regexr.com?2tpo8
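
One way to plug this expression into the question's validation function is sketched below. It is a variant rather than the answer's exact pattern: the `^` and `$` anchors are an added assumption so that the whole input has to match, `/` is escaped for the regex literal, and the ampersand is written as a plain `&`. The names `looseUrlPattern`, `isValidUrl`, and `googleMapsUrl` are hypothetical.

```javascript
// Anchored variant of the expression above (anchors are an assumption).
const looseUrlPattern = /^(((http|ftp|https):\/\/)|www\.)[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:\/~\+#!]*[\w\-\@?^=%&\/~\+#])?$/;

// Returns true when the whole string looks like a URL to this pattern.
function isValidUrl(url) {
    return looseUrlPattern.test(url);
}

// The Google Maps URL from the question.
const googleMapsUrl = "http://maps.google.com/maps?f=d&source=s_d&saddr=Brooklyn%2C+NY&daddr=Stewart+Ave&hl=en&geocode=FRBFbAId0JyX-ykJIXyUFkTCiTGGeAAEdFx2gg%3BFcAqbQIdgPuX-w&mra=mift&mrsp=1&sz=12&sll=40.65%2C-73.95&sspn=0.182857%2C0.308647&g=Brooklyn%2C+New+York%2C+NY%2C+United+States&ie=UTF8&z=12";

console.log(isValidUrl(googleMapsUrl));  // true
console.log(isValidUrl("not a url"));    // false
```

Because the pattern requires either a scheme or a leading `www.`, a bare host such as `example.com` is rejected by this variant.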

Gary Green