I am new to Regex. I want to match a certain URL pagePath pattern for Analytics.
The Problem:
The pattern looks like this:
/(de|en|fr|it)/../any-word-including-dashes/word-or-words-including-dashes-and-numbers
I want to match only this pattern and exclude every pagePath that contains an additional forward slash or does not match the initial pattern:
Include:
/de/ab/word-word/word1-and-something-else
/de/ab/word-word/word1-and-something-else?any_ting1=any.-thing2
Exclude:
/de/ab/word-word/word1-and-something-else/
/de/ab/word-word/word1-and-something-else/anything
/de/ab/word-word
/fr/moreThanTwoCHAR/anything
My Regex:
After searching on SO (Exclude forward slash before end, "Match anything but" and Finding exactly n occurences of "/", disallow 0 or more occurences of a CHAR), I came up with the following regex:
^(\/de|\/fr|\/en|\/it)\/..\/.+\/\w+[^\/]*
What it does correctly
It correctly excludes the following path:
/fr/moreThanTwoCHAR/anything
What it fails on
The problem with the above regex is that it also matches the following path (tested on regex101):
/de/ab/word-word/word1-and-something-else/anything
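To reproduce this outside regex101, here is a minimal Python sketch that runs the regex against the sample paths listed above. It assumes Python's re module treats this pattern the same way as the regex101 flavor I tested. The names PATTERN, SAMPLES and matched_text are only illustrative.

```python
import re

# Pattern copied verbatim from the question.
PATTERN = re.compile(r"^(\/de|\/fr|\/en|\/it)\/..\/.+\/\w+[^\/]*")

# Sample paths from the Include / Exclude lists, mapped to the result I want.
SAMPLES = {
    "/de/ab/word-word/word1-and-something-else": True,
    "/de/ab/word-word/word1-and-something-else?any_ting1=any.-thing2": True,
    "/de/ab/word-word/word1-and-something-else/": False,
    "/de/ab/word-word/word1-and-something-else/anything": False,
    "/de/ab/word-word": False,
    "/fr/moreThanTwoCHAR/anything": False,
}


def matched_text(path):
    """Return the part of `path` that the pattern matched, or None."""
    m = PATTERN.search(path)
    return m.group(0) if m else None


for path, wanted in SAMPLES.items():
    got = matched_text(path)
    status = "ok" if (got is not None) == wanted else "UNEXPECTED"
    print(f"{status:10} wanted={wanted!s:5} matched={got!r}")
```

With this sketch, the path ending in /anything is reported as matched in full. The path with the trailing slash is also reported as a match, but only for the part before the final slash.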
I can't understand why it matches a string with an additional forward slash even though, as far as I understood, I told it to exclude 0 or more additional occurrences. Can anyone explain where I'm mistaken?