
I am trying to extract valid URLs from a document with a regex, but it doesn't work as I expect. This is the regex I have:

https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\ +.~#?&//=]*)

Matching "https://www.example.com/folder/folder/document.pdf" works, but "https://www.example.com/folder/folder/document[first attempt].pdf" does not match. The debugger says: "\[ matches the character [ literally (case sensitive)".

FYI: I tried it on http://regexr.com/

Jeroen

1 Answer


You just need to add \[\] to the second character class as well, the one that matches the path:

https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9\[\]@:%_\ +.~#?&//=]*)
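
To check the difference outside regexr, here is a minimal sketch in Python (an assumption: the question does not say which regex engine is ultimately used, and the helper name `first_match` is made up for illustration). It runs both patterns, copied as written in this thread, against the URL with brackets:

```python
import re

# Pattern from the question: only the first (host) class allows [ and ].
ORIGINAL = re.compile(
    r"https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9@:%_\ +.~#?&//=]*)"
)

# Pattern from the answer: \[\] is also allowed in the second (path) class.
FIXED = re.compile(
    r"https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9\[\]@:%_\ +.~#?&//=]*)"
)


def first_match(pattern, text):
    """Return the text of the first match, or None (hypothetical helper)."""
    m = pattern.search(text)
    return m.group(0) if m else None


url = "https://www.example.com/folder/folder/document[first attempt].pdf"

# Stops right before the "[" because the path class does not contain it.
print(first_match(ORIGINAL, url))
# Covers the whole URL, including "[first attempt].pdf".
print(first_match(FIXED, url))
```

With the original pattern the match is cut off at the first bracket rather than failing outright, which is easy to miss in a debugger view.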

You can find some interesting URL-matching regexes here and here

54l3d
  • That works, but that's in the part after the dot? Why does it have to be there too? – Jeroen Jun 21 '17 at 09:56
  • Because the first class does not contain `/`, the dot after it matches the last dot in the domain name, not the dot of the file extension. – 54l3d Jun 21 '17 at 10:01
  • Thank you for the explanation :-) Maybe I should turn the air conditioning a little lower. – Jeroen Jun 21 '17 at 11:19
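
The point made in the comments, that the first class ends at the domain and the second class covers the path, can be seen by labelling the two parts. The sketch below is an illustration only: the named groups `host` and `path` and the variable names are assumptions added here, not part of the regex in the thread, and Python is used just as a convenient engine.

```python
import re

# First class plus the TLD part: it cannot contain "/", so it ends at the domain.
HOST = r"[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b"
# Second class, with \[\] added as in the answer: it matches everything after the domain.
PATH = r"[-a-zA-Z0-9\[\]@:%_\ +.~#?&//=]*"

# Same pattern as the answer, with the two parts wrapped in named groups.
pattern = re.compile(
    r"https?:\/\/?(?P<host>" + HOST + r")(?P<path>" + PATH + r")"
)

m = pattern.search(
    "https://www.example.com/folder/folder/document[first attempt].pdf"
)

print(m.group("host"))  # www.example.com
print(m.group("path"))  # /folder/folder/document[first attempt].pdf
```

Because the brackets appear in the path and not in the host, it is the second class that has to allow them.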