Why does a regex with \[ not work?

Question

I am trying to read valid URLs from a document with a regex, but it doesn't work as I expect. This is the regex I have:

https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\ +.~#?&//=]*)

If I try to read, for example, "https://www.example.com/folder/folder/document.pdf", it works, but if I try to read "https://www.example.com/folder/folder/document[first attempt].pdf", it doesn't match. The debugger says: "\[ matches the character [ literally (case sensitive)".

FYI: I tried it on http://regexr.com/

`/` does not have to be doubled in a character class, one is enough. — Wiktor Stribiżew, Jun 21 '17 at 09:46

54l3d · Accepted Answer · 2017-06-21T10:04:55.413

3

You just need to add \[\] to the second character class (the one after the dot) as well:

https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9\[\]@:%_\ +.~#?&//=]*)

You can find some interesting URL-matching regexes here and here.
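
A quick way to see the difference is to run both patterns against the URL from the question. This is only a minimal sketch: it assumes Python's `re` module (the question was tested on regexr.com, so other flavors may differ slightly), and the names `ORIGINAL`, `FIXED` and `url` are made up for the illustration.

```python
import re

# Pattern from the question: the trailing class has no [ or ] in it.
ORIGINAL = re.compile(
    r"https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9@:%_\ +.~#?&//=]*)"
)

# Pattern from this answer: \[\] added to the trailing class as well.
FIXED = re.compile(
    r"https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9\[\]@:%_\ +.~#?&//=]*)"
)

url = "https://www.example.com/folder/folder/document[first attempt].pdf"

# group(0) is the whole text consumed by the pattern.
original_match = ORIGINAL.match(url).group(0)
fixed_match = FIXED.match(url).group(0)

print(original_match)  # stops just before the first "["
print(fixed_match)     # covers the whole URL
```

So the original pattern does match, but only up to `document`; the part starting at `[` is left out because nothing in the trailing class accepts it.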

edited Jun 21 '17 at 10:04

answered Jun 21 '17 at 09:46

54l3d

That works, but that's in the part after the dot? Why does it have to be there too? – Jeroen Jun 21 '17 at 09:56

Because the first class does not contain `/`, the `\.` in the pattern matches the last dot in the domain name, not the dot of the file extension. – 54l3d Jun 21 '17 at 10:01
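
To make that split visible, here is a small sketch that cuts the match at the start of the capture group. It assumes Python's `re` module, and the names `FIXED`, `url`, `head` and `tail` are only illustrative.

```python
import re

# The corrected pattern, with \[\] in both character classes.
FIXED = re.compile(
    r"https?:\/\/?[-a-zA-Z0-9@:%._\+~#=\[\]]{2,256}\.[a-z]{2,6}\b"
    r"([-a-zA-Z0-9\[\]@:%_\ +.~#?&//=]*)"
)

url = "https://www.example.com/folder/folder/document[first attempt].pdf"
m = FIXED.match(url)

# Everything before group 1 was consumed by the scheme, the first class,
# the literal dot and the [a-z]{2,6} part.
head = url[:m.start(1)]
# Group 1 is what the trailing class consumed.
tail = m.group(1)

print(head)
print(tail)
```

The first class cannot get past the first `/`, so the brackets in the file name can only ever be consumed by the trailing class.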

Thank you for the explanation :-) Maybe I should turn the air conditioning a little lower. – Jeroen Jun 21 '17 at 11:19
