I'm trying to write a regex to match URLs in the following format:

https://www.test.com/module/1dfce6a564rn184930d829205943373e

https://www.test.com/directory/67a58dc9165ti206461d34fe1783a7e1

There will always be exactly one subdirectory after the domain. The URLs will always end in a 32-character alphanumeric string.

Thanks in advance

  • What have you tried? The 2nd line has 33 alphanumeric characters – Kristian Feb 16 '21 at 11:15
  • try this: ```.+\/([a-zA-Z0-9]{32})``` on https://regex101.com/ – Kristian Feb 16 '21 at 11:16
  • _There will always be one subdirectory after the domain_ then try: ```https:\/\/.+\/.+\/([a-zA-Z0-9]{32})``` – Kristian Feb 16 '21 at 11:17
  • I'm not great with regex but tried this so far ((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32}) and it's not working correctly – user15219953 Feb 16 '21 at 11:21
  • Does this answer your question? [Reference - What does this regex mean?](https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) – knittl Feb 16 '21 at 11:25

1 Answer

Let me summarize your problem:

  • The URL scheme must be http, https, or ftp.
  • The URL must have one directory before the file name.
  • The file name in the URL must be a 32-character alphanumeric string.

Your attempt: ((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})

Test string:

Your approach is already close. You just need to make sure that no more characters follow those 32 characters, so add $ at the end.
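
To see what the trailing $ changes, here is a minimal Python sketch. It assumes Python's `re` module and uses `re.search`, which may differ in details from the regex flavor you are actually using. The names `ORIGINAL`, `WITH_END_ANCHOR`, and `matches` are only illustrative.

```python
import re

# The pattern from the question, and the same pattern with `$` appended.
ORIGINAL = r"((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})"
WITH_END_ANCHOR = ORIGINAL + "$"

good_url = "https://www.test.com/module/1dfce6a564rn184930d829205943373e"
too_long = good_url + "0"  # 33 characters after the last slash


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


print(matches(ORIGINAL, too_long))         # True: the first 32 characters are enough
print(matches(WITH_END_ANCHOR, too_long))  # False: `$` forbids the extra character
print(matches(WITH_END_ANCHOR, good_url))  # True
```

Without the $, the 33-character name still matches because the regex is satisfied as soon as it has consumed 32 characters.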

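Also, the scheme group (http/https/ftp) is followed by ?, which makes it optional: a scheme is captured if it is present, but the URL also matches without one. To require the scheme, insert a {1} quantifier right after the scheme group. The existing ? then becomes a lazy modifier on {1}, so the group must match exactly once.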

Answer: ((http[s]?|ftp):\/){1}?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})$
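
As a rough check of the scheme requirement, the sketch below (again assuming Python's `re`) compares the answer's pattern with a variant that simply drops the quantifier after the scheme group. `SIMPLER` is my own illustrative variant, not part of the original answer. On these samples both behave the same, which suggests that {1}? here just means "exactly once".

```python
import re

# Pattern from the answer: `{1}?` is a lazy "exactly once" on the scheme group.
ANSWER = r"((http[s]?|ftp):\/){1}?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})$"
# Illustrative variant: no quantifier at all on the scheme group.
SIMPLER = r"((http[s]?|ftp):\/)\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})$"

samples = [
    "https://www.test.com/module/1dfce6a564rn184930d829205943373e",
    "ftp://www.test.com/directory/67a58dc9165ti206461d34fe1783a7e1",
    "www.test.com/module/1dfce6a564rn184930d829205943373e",  # no scheme
]


def results(pattern: str) -> list[bool]:
    """Return one True/False per sample, in order."""
    return [re.search(pattern, s) is not None for s in samples]


print(results(ANSWER))   # [True, True, False]
print(results(SIMPLER))  # [True, True, False]
```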

Another possible improvement is to add ^ at the start. This ensures that a line that does not begin with http, https, or ftp will not match.

Regex: ^((http[s]?|ftp):\/){1}?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})$

Will not match: test.com/directory1/directory2/67a58dc9165ti206461d34fe1783a7e1
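
One caveat worth checking: (\/\w+)* allows any number of directories, including none, so the anchored regex does not by itself enforce "exactly one subdirectory". The Python sketch below illustrates this. `ONE_DIR` is a hypothetical stricter variant of my own, not taken from the answer, and it assumes the directory name consists of \w characters only.

```python
import re

# Anchored pattern from the answer.
ANCHORED = r"^((http[s]?|ftp):\/){1}?\/?([^:\/\s]+)((\/\w+)*\/)([0-9a-z]{32})$"
# Hypothetical stricter variant: scheme, host, exactly one directory, 32-char name.
ONE_DIR = r"^(https?|ftp):\/\/([^:\/\s]+)\/\w+\/([0-9a-z]{32})$"

NAME = "67a58dc9165ti206461d34fe1783a7e1"
no_scheme = "test.com/directory1/directory2/" + NAME
one_dir = "https://www.test.com/directory/" + NAME
two_dirs = "https://www.test.com/directory1/directory2/" + NAME
embedded = "see " + one_dir  # URL preceded by other text


def matches(pattern: str, text: str) -> bool:
    """Return True if the pattern is found in the text."""
    return re.search(pattern, text) is not None


print(matches(ANCHORED, no_scheme))  # False: the scheme is required
print(matches(ANCHORED, embedded))   # False: `^` requires the URL at the start
print(matches(ANCHORED, two_dirs))   # True: (\/\w+)* accepts two directories
print(matches(ONE_DIR, two_dirs))    # False
print(matches(ONE_DIR, one_dir))     # True
```

Whether you need the stricter form depends on whether URLs with more or fewer directories can actually appear in your input.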

To try out your regex, you can use the online tool regex101.com. It explains what each part of your regex means in the right-hand panel.

Kristian