3

What is the exact difference between .* and [.]*?

I have tried to put these inside parentheses, for back-reference, and saw the results are not the same although I don't understand why.

The . is supposed to match any single character.

So I guess whether it's inside square brackets or not should not be important with the * (match zero or more) operator. But it is. Why?

Laurel
carmellose

2 Answers

8

In .*, the dot is a special character that matches any character except a newline (with a DOTALL modifier it matches newlines as well). Inside a character class ([.]), the dot loses its special meaning and matches only a literal dot. So .* matches any run of characters, while [.]* matches only a run of dots.
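A minimal sketch of the difference, using Python's re module purely for illustration (the question does not name a language, and the sample strings and variable names are assumptions made up for this example):

```python
import re

text = "a.b...c"

# Unbracketed dot: any character except a newline, so (.*) captures the whole string.
any_chars = re.match(r"(.*)", text).group(1)

# Bracketed dot: a literal "." only. The string starts with "a",
# so ([.]*) succeeds by capturing zero dots.
leading_dots = re.match(r"([.]*)", text).group(1)

# The same class does capture dots where they actually occur.
dot_run = re.search(r"b([.]*)c", text).group(1)

# With DOTALL the unbracketed dot also matches newlines.
two_lines = "x\ny"
without_dotall = re.match(r"(.*)", two_lines).group(1)
with_dotall = re.match(r"(.*)", two_lines, re.DOTALL).group(1)

print(repr(any_chars), repr(leading_dots), repr(dot_run))
print(repr(without_dotall), repr(with_dotall))
```

The empty capture from ([.]*) is the likely reason the back-references in the question differ: the group still matches, but it holds only as many literal dots as appear at that position.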

This behavior is the same across regex flavors.

Wiktor Stribiżew
  • Ok then how do you specify: any number of any (`.`) or white space (`\s`) chars? e.g. `1([.\s]*)2` capture the group between 1 and 2 in multi-line regex? – user14492 Jun 29 '20 at 18:25
  • @user14492 [Your regex](https://regex101.com/r/nJ36I2/1) matches any 0 or more dots or whitespaces between `1` and `2`. If you want to match any text between `1` and `2`, you need to refer to [How do I match any character across multiple lines in a regular expression?](https://stackoverflow.com/questions/159118) and use `(?s)1(.*)2` / `(?s)1(.*?)2` or `1([\w\W]*)2` / `1([\w\W]*?)2`. There may be other options, like `1([^]*)2` / `1([^]*?)2` in JavaScript. – Wiktor Stribiżew Jun 29 '20 at 18:37
  • Thx, it's not always related to multi-line; that was an example off the top of my head. I also wanted to know the correct form for any number of any chars (`.`) plus specific chars (`\t`). Because when you put the dot (`.`) in a pick group (square brackets), e.g. `[.\t]+`, it doesn't mean the same thing? Because currently I have to replace all dots in square brackets with `[\t.]+ === [\ta-zA-Z0-9]`? – user14492 Jun 30 '20 at 14:23
  • @user14492 Matching any character + a specific character = matching any character. – Wiktor Stribiżew Jun 30 '20 at 14:28
  • Not for white space chars like `\n, \r, \t, \s`? – user14492 Jun 30 '20 at 14:30
  • @user14492 `\s` is not a whitespace char, it is a whitespace char pattern, and it matches all of them. No idea what you mean. Provide a test case if you need help. – Wiktor Stribiżew Jun 30 '20 at 14:35
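
The patterns discussed in these comments can be compared in a short Python sketch. The sample strings and variable names are assumptions for illustration only. The `[^]` variant is left out because it is specific to JavaScript.

```python
import re

text = "1 foo\nbar 2"
dots_and_spaces = "1 .. \n . 2"

# [.\s]* allows only literal dots and whitespace between the digits.
class_pattern = re.compile(r"1([.\s]*)2")
# (?s) turns on DOTALL, so the dot matches newlines as well.
dotall_pattern = re.compile(r"(?s)1(.*)2")
# [\w\W] is a class that matches every character without needing a flag.
any_class_pattern = re.compile(r"1([\w\W]*)2")

# Letters sit between 1 and 2 here, so the dot-or-whitespace class cannot match.
no_match = class_pattern.search(text)
# It does match when only dots and whitespace separate the digits.
class_group = class_pattern.search(dots_and_spaces).group(1)

# Both "match anything" variants capture the text across the line break.
dotall_group = dotall_pattern.search(text).group(1)
any_class_group = any_class_pattern.search(text).group(1)

print(no_match, repr(class_group), repr(dotall_group), repr(any_class_group))
```

Adding `\t` or `\s` to a pattern that already matches any character changes nothing once DOTALL is on, which is the point of the "any character + a specific character" remark.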
2
  • . matches any character, apart from newline.
  • \. only matches a literal ".".
  • [.] is equivalent to [\.] or \. and matches only a literal dot. This is just for convenience, because you almost certainly don't want it to match "any character" in the context of a character group.

Bonus -- If you use my ruby gem, you can easily experiment with stuff like this:

/./.examples # => ["a", "b", "c", "d", "e"]
/\./.examples # => ["."]
/[.]/.examples # => ["."]
Tom Lord