I'm working on a project that requires me to match URLs. I'm horrible at regex, but I believe the structure is as follows:

  • letters and symbols, except space, repeated 1 or more times,
  • immediately followed by a dot, 0 or 1 times,
  • followed by letters and symbols, except space, 1 or more times,
  • followed by a dot ".",
  • followed by one of a list of valid extensions (like com, org, in; the list will be specified here),
  • followed by a "/", 0 or 1 times
    • if 0, then immediately followed by a space
    • if 1, then words and symbols except

How do I form the regex for this?

Prateek Narendra
  • @AvinashRaj I tried this [regex](http://rubular.com/r/Bqh38VDz50), only to realise that it accepts strings like that. Then I realised I can match .com, .org, .in and so on. – Prateek Narendra Sep 21 '14 at 13:20
  • Ruby uses Perl-style regexes, and [the URL spec](http://www.w3.org/Addressing/URL/url-spec.txt) is more difficult to regex than you might think. If you're not sure, you may be better off using one someone else has built for you, like the answer suggested above. – ShaneQful Sep 21 '14 at 13:21
  • @ShaneQful That doesn't take care of URLs like fb.me or 9gag.com. I just need to match URLs found on Twitter, of the forms http://*.*.*/* and *.*.*/* (no spaces). – Prateek Narendra Sep 21 '14 at 13:27
  • @AvinashRaj I tried this [regex](http://rubular.com/r/ff3QDTbDlG), but it failed to match. – Prateek Narendra Sep 21 '14 at 13:35
  • Why the python and ruby tags? – Brad Werth Sep 25 '14 at 05:12

2 Answers


You could try the regex below to match URLs that satisfy the above criteria.

(?:https?:\/\/)?[^\W\s_]+\.?[^\W\s_]+\.(?:com|org|me)(?:\/[^\W\s_]+)?
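As a rough illustration, the pattern above could be applied in Python (one of the question's tags) along these lines. The names `URL_PATTERN` and `find_urls` and the sample text are assumptions made for this sketch, and the extension list is just the com/org/me alternation from the pattern, so extend it as needed.

```python
import re

# The pattern from this answer, unchanged. [^\W\s_] matches a single
# letter or digit (a word character that is not an underscore).
URL_PATTERN = re.compile(
    r"(?:https?:\/\/)?[^\W\s_]+\.?[^\W\s_]+\.(?:com|org|me)(?:\/[^\W\s_]+)?"
)


def find_urls(text):
    """Return every substring of text that the pattern matches (hypothetical helper)."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return URL_PATTERN.findall(text)


sample = "Check http://www.example.com/path and fb.me or 9gag.com today"
print(find_urls(sample))
# ['http://www.example.com/path', 'fb.me', '9gag.com']
```

Note that the pattern is not anchored, so it can also match the start of a longer word (for example the `example.com` in `example.community`), and it only accepts a single path segment made of letters and digits.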


Avinash Raj

Regex is not well suited to parsing grammars or validating input; it is meant for string pattern matching.

Use a parser to validate the syntax of an input; in your case, try Ruby's URI. It's part of the 1.8.7 default library.
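Since the question is also tagged python, the same parser-first idea can be sketched with Python's standard `urllib.parse` module in place of Ruby's URI. This is only a minimal sketch under stated assumptions: `looks_like_url` and `ALLOWED_TLDS` are hypothetical names, the extension set is a placeholder, and prefixing `http://` onto schemeless input such as `fb.me` is a choice made here so that the parser can find the host.

```python
from urllib.parse import urlsplit

# Placeholder list of accepted extensions (an assumption for this sketch).
ALLOWED_TLDS = {"com", "org", "me", "in"}


def looks_like_url(candidate, allowed_tlds=ALLOWED_TLDS):
    """Hypothetical helper: parse candidate and check its host's last label."""
    # The question rules out spaces anywhere in the URL.
    if any(ch.isspace() for ch in candidate):
        return False
    # Schemeless input like "fb.me" would otherwise be parsed as a path.
    if "://" not in candidate:
        candidate = "http://" + candidate
    try:
        parts = urlsplit(candidate)
    except ValueError:
        # urlsplit raises ValueError on some malformed hosts.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    labels = (parts.hostname or "").split(".")
    # Require at least "name.ext", no empty labels, and a known extension.
    return len(labels) >= 2 and all(labels) and labels[-1] in allowed_tlds


print(looks_like_url("fb.me"))                        # True
print(looks_like_url("http://www.example.com/path"))  # True
print(looks_like_url("not a url"))                    # False
```

The parser only splits the string into components; the extension check is still done by hand, so this validates a single candidate string rather than finding URLs inside free text.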

dfherr