1

I'm kinda new to regex, and specifically, I don't understand why there are 2 backslashes. I mean, I know the second one is there to escape the character "*", but what does the first backslash do?

Well, I'm passing this regex to the PHP function preg_match(), and I'm trying to find strings that include 2 or more consecutive "*".

4 Answers

3

That regex is invalid syntax.

You have this piece:

*{2,}

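Which basically would read: repeat the previous item any number of times, and then repeat that 2 or more times. A quantifier cannot be applied directly to another quantifier.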


The following regex:

/\\*.{2,}/

Is the simplest and closest regex to the one you have, which would read as:
match 0 or more '\' followed by 2 or more characters that aren't newlines


If you are talking about the string itself, it may be interpreted in 2 ways:

  • /\\*{2,}/
    Read as: match a literal \ zero or more times, then repeat that 2 or more times
    This is invalid syntax
  • /\*{2,}/
    Read as: match 2 or more *
    This is valid syntax

Which one applies depends on whether the string literal treats the backslash as an escape character before the regex engine sees the pattern.


Edit:

Since the question was updated to show which language and engine are being used, I've added the following information:

You have to pass the regex as '/\*{2,}/' OR as "/\\*{2,}/" (watch the quotes).

Both are very similar, except that single quotes ('') only support the following escape sequences:

  • \' - Produces '
  • \\ - Produces \

Double-quoted strings are treated differently in PHP. They support many more escape sequences, like:

  • \" - Produces "
  • \\ - Produces \
  • \x<2-digit hex number> - Same as chr(0x<2-digit hex number>)
  • \0 - Produces a null char
  • \1 - Produces a control char (same as chr(1))
  • \u{<hex code point>} - Produces the UTF-8 encoding of that Unicode code point (PHP 7.0 and later; the braces are required)
  • \r - Produces a carriage return (the line ending on classic Mac OS)
  • \n - Produces a line feed (the newline on Linux and newer OSX; Windows uses \r\n)
  • \t - Produces a tab
  • \<number> or \0<number> - Same as \x, but the numbers are in octal (e.g.: "\75" and "\075" produce =)
  • ... (some more that I probably forgot) ...
  • \' or \<anything else> - Not an escape sequence; the backslash is kept (e.g.: "\'" produces \' and "\*" produces \*)

Read more about this on https://php.net/manual/en/language.types.string.php
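As a quick check of the quoting rules above, here is a minimal sketch that compares the string literals byte for byte. The variable names are made up for illustration.

```php
<?php
// Three ways to spell the same pattern as a PHP string literal.
$single  = '/\*{2,}/';   // single quotes: \* is not an escape, so the backslash is kept
$double  = "/\\*{2,}/";  // double quotes: \\ collapses to a single backslash
$lenient = "/\*{2,}/";   // double quotes: \* is not a recognised escape, so the backslash is kept

var_dump($single === $double);   // bool(true)
var_dump($single === $lenient);  // bool(true)
echo $double, PHP_EOL;           // /\*{2,}/
```

Whichever literal is used, preg_match() receives the same 8-character pattern, so the choice of quotes only changes how the backslash has to be typed.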

Ismael Miguel
  • AFAIK it is valid, but doesn't do much – PeeHaa May 08 '15 at 17:32
  • Please note, that OP is not showing us a regex, but a regex string, that needs to have the escape character escaped. See my answer. – CptBartender May 08 '15 at 17:34
  • @PeeHaa You can test on https://regex101.com/ and http://www.gethifi.com/tools/regex. Both will barf when you throw that bad boy. – Ismael Miguel May 08 '15 at 17:34
  • I see. What engine are they on? Because rubular happily runs with it http://rubular.com/r/HcUYkFkAfE – PeeHaa May 08 '15 at 17:38
  • To answer my own question. They all error. Even the PCRE one – PeeHaa May 08 '15 at 17:40
  • @PeeHaa With the new edit, I think I nailed the answer. – Ismael Miguel May 08 '15 at 17:51
  • Haha thanks man but what difference does it make to have "" vs. ' '? One is string and the other is, well, char? – theOneAndOnlyFerris May 08 '15 at 17:59
  • @theOneAndOnlyFerris `""` will parse character escapes and variables, while `''` won't. Using `''` will only detect the following escape sequences: `\'` and `\\`. All the others are only available with `""` (and heredocs). You can read more on https://php.net/manual/en/language.types.string.php – Ismael Miguel May 08 '15 at 18:04
1

Is it a string literal written in a program and if so which one? The double backslash may be to escape the escape char so that this regex matches at least 2 * star characters.

In JavaScript, for example, you need to escape the \ so that your string literal can express it as data before it is turned into a regular expression by the RegExp constructor. See also: Why do regex constructors need to be double escaped?

sebnukem
1

Depending on the platform you're using, "/\\*{2,}/" may actually be a representation of the /\*{2,}/ string. This is because languages like Java treat \ as an escape character inside string literals, so to actually put that character in the regex, you need to escape it in the regex string.

So, we have the /\*{2,}/ regex. \* matches the star character, and {2,} means at least two times. Your regex will match any two or more consecutive star characters.
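To illustrate that reading of the pattern in PHP, here is a small sketch. hasConsecutiveStars() is a hypothetical helper named for this example, not something from the question.

```php
<?php
// Hypothetical helper: true when $text contains two or more consecutive asterisks.
function hasConsecutiveStars(string $text): bool
{
    // preg_match() returns 1 on a match, 0 on no match, and false on error.
    return preg_match('/\*{2,}/', $text) === 1;
}

var_dump(hasConsecutiveStars('a*b'));    // bool(false): only one star
var_dump(hasConsecutiveStars('a**b'));   // bool(true)
var_dump(hasConsecutiveStars('*a*b*'));  // bool(false): three stars, none adjacent
```

Comparing against 1 with === keeps an error result (false) from being mistaken for "no match".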

CptBartender
0

For PHP, what that regex does is match a literal * repeated 2 or more times. You can easily see this in the diagram below:

Regular expression visualization

But when you code it in PHP, you have to escape the backslash (with another backslash) to use it in a string. For instance:

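$re = "/\\*{2,}/";  // "\\" in the PHP string becomes a single \ in the regex: /\*{2,}/
$str = "...";

// $matches[0] receives the first run of 2 or more consecutive * found in $str
preg_match($re, $str, $matches);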
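As a usage sketch of the snippet above, the following shows what ends up in $matches. The sample input is made up for illustration, and the array destructuring assumes PHP 7.1 or later.

```php
<?php
$re  = "/\\*{2,}/";
$str = "a***b**c";  // made-up sample input

if (preg_match($re, $str, $matches, PREG_OFFSET_CAPTURE) === 1) {
    // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
    [$text, $offset] = $matches[0];
    echo "first run: $text at offset $offset", PHP_EOL;  // first run: *** at offset 1
}

// preg_match_all() collects every run instead of stopping at the first one.
$count = preg_match_all($re, $str, $all);
print_r($all[0]);  // the two runs: "***" and "**"
```

preg_match() stops at the first run it finds, so preg_match_all() is the one to reach for when every run of stars is needed.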
Federico Piazza