
In some regex patterns I see a colon (:) used. For example:

(?:"[^"]*[^-]?>)|(?:[^\w\s]\s*\/>)|(?:>")

I expected this to match only text that contains a colon, but it also matches:

"><script>alert("hello")</script>

That string has no colons in it. Does the colon have a special meaning in this context? Can anyone explain?

Edit: I think those links explain regex in general, which does not address my specific question. @chris85 and @sin have given the answer. Thanks.

Shahadat Hossain

1 Answer


Advanced regular expressions need special constructs to convey meaning.

By far the most versatile regex construct is the simple open/close grouping
( .. ), where the parentheses are the delimiters.

They mark the start and end of a scoped group of constructs.

This construct is sub-divided into special forms by adding certain
characters after the open delimiter (.
These characters tell the engine what the group represents.
To mark the start of a special construct, a question mark ? follows the open paren, like so: (?.
What follows is a character or characters that uniquely identify what the group does.

Here is an (incomplete) list of the opening delimiters of grouping constructs.

( capture group
(? modifier group
(?: cluster group
(?# comment group
(?| branch reset group
(?' named capture group
(?< named capture group
(?> atomic group

(?= positive lookahead assertion group
(?! negative lookahead assertion group
(?<= positive lookbehind assertion group
(?<! negative lookbehind assertion group

(?& recursion group
(?( conditional group
(* backtracking control verb, e.g. (*PRUNE) (Perl/PCRE; opens with (* rather than (?)

(?{ assertion code group
(??{ regex injection code group
(?C callout code group

Each of these delimiters is recognized only if the engine supports it, and support varies from engine to engine.
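As a rough illustration of what (?: means in practice, here is a minimal sketch that runs the pattern from the question under Python's re module (the choice of engine is an assumption; the question does not say which one is in use). It suggests that the colon is part of the (?: delimiter, not a literal character to match, and that the group only clusters without capturing.

```python
import re

# The pattern from the question: three alternatives, each wrapped in (?: ... ).
pattern = re.compile(r'(?:"[^"]*[^-]?>)|(?:[^\w\s]\s*\/>)|(?:>")')
text = '"><script>alert("hello")</script>'

match = pattern.search(text)
print(match.group(0))   # text matched by the first alternative; it contains no colon
print(pattern.groups)   # number of capture groups; (?: ... ) adds none

# The same alternatives with plain ( ... ): each one now becomes a capture group.
capturing = re.compile(r'("[^"]*[^-]?>)|([^\w\s]\s*\/>)|(>")')
print(capturing.groups)
print(capturing.search(text).groups())
```

Both patterns match the same text; the only difference is whether the engine remembers what each group matched.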

You can see that the metacharacters used in these opening sequences
are the ( and the ?.
Used together, in that order, they form the start of a special grouping construct.

These hardcoded constructs are easy to miss visually.
So it is in your interest to know them ahead of time, so that you don't
get confused when you see something like (?::?).
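To unpack (?::?) concretely: the first colon belongs to the (?: delimiter, and the second is a literal colon made optional by the trailing ?. A small sketch, again assuming Python's re module (the key_value pattern is a hypothetical example, not taken from the question):

```python
import re

# (?:  opens a non-capturing group
# :?   is a literal colon, made optional by the ? quantifier
# )    closes the group
optional_colon = re.compile(r"(?::?)")

print(optional_colon.fullmatch(":") is not None)   # a single colon matches
print(optional_colon.fullmatch("") is not None)    # so does the empty string
print(optional_colon.groups)                       # the group captures nothing

# Hypothetical use: a key, an optional colon, then a value.
key_value = re.compile(r"(\w+)(?::?)\s*(\w+)")
print(key_value.match("name: value").groups())
print(key_value.match("name value").groups())
```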

Hope this helps.


Just a hint here.
Using a ( followed by a ? to open an advanced
grouping construct was a very clever idea.

If a particular grouping construct is not supported by an engine,
it will assume the ? to be a quantifier.

However, you can't quantify an open paren ( when it is a metacharacter,
i.e. when it is not an escaped literal:
\(? <- OK, quantified literal
(? <- advanced grouping construct, or BAD if not supported

The result is that any advanced construct starting with (? that is not supported
will cause the engine to throw an error such as
'Illegal quantification of metacharacter' or 'Unsupported construct'.
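This behaviour can be checked with a short sketch. It assumes Python's re module, which does not support the branch reset group (?|, so that construct serves here as the unsupported one; the exact error text differs between engines, and try_compile is a hypothetical helper written for this example.

```python
import re

def try_compile(pattern: str) -> str:
    """Return 'ok' if the pattern compiles, otherwise the engine's error text."""
    try:
        re.compile(pattern)
        return "ok"
    except re.error as exc:
        return f"error: {exc}"

# An escaped paren is a literal, so quantifying it with ? is fine.
print(try_compile(r"\(?abc"))

# (?: is supported by Python's re, so this compiles.
print(try_compile(r"(?:a|b)"))

# (?| (branch reset) is not supported by Python's re, so compilation fails
# instead of silently treating the ? as a quantifier.
print(try_compile(r"(?|a|b)"))
```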