I'm using the following regular expression to check whether URLs are valid:

var re = /^(http[s]?:\/\/(www\.)?|ftp:\/\/(www\.)?|www\.){1}([0-9A-Za-z-\.@:%_+~#=]+)+((\.[a-zA-Z]{2,3})+)(/(.)*)?(\?(.)*)?/;
var is_valid = re.test(input_url);

It works with short inputs, but seems to run forever with longer ones. Consider the following 64-character input:

re.test("http://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

This call doesn't complete within minutes in an up-to-date Google Chrome.

Is there a problem with the regular expression?

muffel
  • [Catastrophic backtracking](http://www.regular-expressions.info/catastrophic.html) with nested quantifiers in `([0-9A-Za-z-\.@:%_+~#=]+)+` ... Remove one of the `+`... Change to `([0-9A-Za-z-\.@:%_+~#=]+)` – Mariano Oct 07 '15 at 06:52
  • There are other bottlenecks here, but the underlying problem is a single one: nested quantifiers of various types. – Wiktor Stribiżew Oct 07 '15 at 06:56
  • @Mariano great, thank you! – muffel Oct 07 '15 at 06:56
  • Besides the backtracking issues, this regex doesn't come close to verifying a URL. For example, these would be accepted as valid: `www.........abc!!!!!!!!!!!!`, `www.@.foo`, `www.abc.com'; drop table user;` –  Oct 07 '15 at 07:02
  • There's also an unescaped `/`. I doubt it works. – Mariano Oct 07 '15 at 07:05

1 Answer

The hang is caused by catastrophic backtracking, as Mariano mentioned in the comments. When a quantified group itself contains a quantifier, as in `([0-9A-Za-z-\.@:%_+~#=]+)+`, the same run of characters can be split between the inner and the outer repetition in exponentially many ways. When the string doesn't match, the engine tries every one of those splits before giving up, so it appears to hang.
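To make the effect concrete, here is a minimal sketch that isolates the nested quantifier. The two patterns are cut-down versions of the regex in the question (only the `http://` prefix and the host part are kept, which is a simplification of mine), and `timeTest` is a hypothetical helper introduced just for this illustration.

```javascript
// Nested quantifier: the inner `+` and the outer `+` can split the same run
// of characters in exponentially many ways.
var nested = /^http:\/\/([0-9A-Za-z\-.@:%_+~#=]+)+(\.[a-zA-Z]{2,3})+/;

// Single quantifier: each run length can be matched in only one way.
var flat = /^http:\/\/([0-9A-Za-z\-.@:%_+~#=]+)(\.[a-zA-Z]{2,3})+/;

// Hypothetical helper: run one test and report the result and elapsed time.
function timeTest(re, input) {
  var start = Date.now();
  var result = re.test(input);
  return { result: result, ms: Date.now() - start };
}

// Kept short on purpose so that `nested` still finishes.
var shortInput = "http://" + "x".repeat(18);
// A 64-character input of the same shape as the one in the question.
var longInput = "http://" + "x".repeat(57);

// With `nested`, the time tends to roughly double for each extra "x".
console.log(timeTest(nested, shortInput));
// With `flat`, the full-length input fails quickly.
console.log(timeTest(flat, longInput));
```

Both patterns accept and reject the same strings; only the amount of work done on a failing input differs. Do not run `nested` against `longInput` unless you are prepared to kill the process.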

Beyond this, however, the regex has several other problems (the comments list strings it wrongly accepts) and is not fit for purpose. I suggest you start over with one of the methods from previous questions on this topic:

Trying to Validate URL Using JavaScript

Javascript regular expression to validate URL
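As one possible regex-free starting point, the sketch below relies on the built-in `URL` constructor, which throws on input it cannot parse. This is not taken from the linked questions; `isHttpOrFtpUrl` is a hypothetical helper name, and the code assumes an environment where the global `URL` is available (current browsers and Node.js).

```javascript
// Hypothetical helper: accept only strings that parse as absolute URLs
// with one of the schemes the original regex was aiming at.
function isHttpOrFtpUrl(input) {
  var parsed;
  try {
    parsed = new URL(input); // throws a TypeError if the input is not a valid absolute URL
  } catch (e) {
    return false;
  }
  return ["http:", "https:", "ftp:"].indexOf(parsed.protocol) !== -1;
}

console.log(isHttpOrFtpUrl("http://example.com/path?q=1"));
console.log(isHttpOrFtpUrl("www.example.com"));
console.log(isHttpOrFtpUrl("javascript:alert(1)"));
```

Note that this is stricter than the original regex in one way and looser in another: a bare `www.example.com` is rejected because it has no scheme, while a dotless host such as the one in the question's test input is accepted because it is syntactically a valid URL.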

Community