not able to understand preg_match_all()

Question

This question already has a duplicate (Duplicate question).
I'm not able to comment there, so I'm creating a new question.

The solution provided is very explanatory, but I am still not able to get a clear view of preg_match_all().

I tried the following code

preg_match_all("/#+([a-zA-Z0-9_]+)/i","#test this is #php test",$matches);

var_dump($matches);

The result is

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(5) "#test"
    [1]=>
    string(4) "#php"
  }
  [1]=>
  array(2) {
    [0]=>
    string(4) "test"
    [1]=>
    string(3) "php"
  }
}

My understanding is that the regex will only select the string starting with '#' as per my code.

But in the result, the array contains the string with '#' and without '#'.

Please help me figure this out. What am I missing?

`My understanding is that the regex will only select the starting with '#' as per my code` - that's because your understanding is wrong. To match only at the start of the string, you need to add the start-of-string anchor `^` to the front of the pattern: `/^#+([a-zA-Z0-9_]+)/i`. Outside of a character class `[...]`, the caret `^` matches the start of the string (or of each line in multiline mode). Inside one, as in `[^0-9]`, it means "not". It's confusing, but that is how it works. Also, you can replace `[a-zA-Z0-9_]` with `\w`, which is the same thing. And the `i` (case-insensitive) flag is pointless here, as the class already lists both upper and lowercase. That gives us `/^#+(\w+)/` — ArtisticPhoenix, Mar 14 '19 at 07:18
The first array holds the matches of the whole regexp. The next array holds the matches of the parenthesized group (in your case `[a-zA-Z0-9_]+`). — Pavel Třupek, Mar 14 '19 at 07:23
Conversely, `$` matches the end. I would have posted this as an answer, but I saw the duplicate link. As for the matches, they are indexed by the capture groups `(...)`, starting at `1`. Index `0`, just like in `preg_match`, is the "full match". — ArtisticPhoenix, Mar 14 '19 at 07:24
Take a look at the section on groups in the duplicate. Your `[0]` array elements match the entire regex i.e. `#+([a-zA-Z0-9_]+)` => `['#test', '#php']`, the `[1]` elements match just the contents of the `()` i.e. `[a-zA-Z0-9_]+` => `['test','php']` — Nick, Mar 14 '19 at 07:24
@ArtisticPhoenix The insight on `^` was new information to me. Thank you for that. — Braike dp, Mar 14 '19 at 07:26
Sure, regex can be very confusing. I am pretty good at it now after some years of struggling, although lookarounds still get me once in a while. Regex is extremely powerful, so it's worth the work to learn it. A good testing site is regex101 (https://regex101.com/), mainly because you can save your work, and it has built-in documentation and a great regex parser. It doesn't do everything correctly from a PHP standpoint (for example, there is no `preg_match_all`), but you can get close. — ArtisticPhoenix, Mar 14 '19 at 07:30
@Nick OK. But in that case, why did it skip the strings without a `#`, such as `this is`? — Braike dp, Mar 14 '19 at 07:31
It skipped them because they have no `#`, and the pattern has no `\s` to match spaces. What your code does is match any part that starts with one or more `#` and is followed by characters from `a-z`, `A-Z`, `0-9` or `_`. No spaces. — ArtisticPhoenix, Mar 14 '19 at 07:32
@ArtisticPhoenix @Nick Thank you for the help. That was very informative and cleared up my doubt. — Braike dp, Mar 14 '19 at 07:34
Here is your regex (https://regex101.com/r/5Sm864/1) on the testing site I mentioned above. If you look on the right, it gives you a full explanation of what it does, which is why I like this tester so much. Note that the `g` flag is more of a JavaScript regex thing than a PHP thing. This is one of the things I meant when I said it's not 100% correct for PHP. — ArtisticPhoenix, Mar 14 '19 at 07:35
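
To tie the comments together, here is a minimal sketch of how the pieces of `$matches` relate to the pattern from the question. The variable names (`$withHash`, `$sets`, `$onlyFull`, `$anchored`) are only illustrative and do not come from the thread, and the redundant `i` flag is dropped as suggested above. It shows the default ordering, the `PREG_SET_ORDER` alternative, a non-capturing group `(?:...)` for when only the full matches are wanted, and the effect of anchoring with `^`.

```php
<?php
// Pattern and subject are taken from the question.
$pattern = '/#+([a-zA-Z0-9_]+)/';
$subject = '#test this is #php test';

// Default ordering (PREG_PATTERN_ORDER):
// $matches[0] holds every full match, $matches[1] holds capture group 1.
preg_match_all($pattern, $subject, $matches);
$withHash    = $matches[0]; // ['#test', '#php']
$withoutHash = $matches[1]; // ['test', 'php']

// PREG_SET_ORDER groups the results per match instead of per group.
preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER);
// $sets[0] is ['#test', 'test'], $sets[1] is ['#php', 'php']

// A non-capturing group (?:...) leaves only the full matches.
preg_match_all('/#+(?:[a-zA-Z0-9_]+)/', $subject, $onlyFull);
// $onlyFull is [['#test', '#php']]

// Anchoring with ^ matches only at the start of the subject,
// so '#php' in the middle of the string is no longer found.
preg_match_all('/^#+(\w+)/', $subject, $anchored);
// $anchored is [['#test'], ['test']]

var_dump($withHash, $withoutHash, $sets, $onlyFull, $anchored);
```

So the output in the question is expected: the `#`-prefixed strings are the full matches, and the strings without `#` are what the parentheses captured.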
