I have been sitting for hours trying to figure out a regex for the preg_match_all function in PHP. My problem is that I want two different things from the string.

Say you have the string "Code is fun [and good for the brain.] But the [brain is] tired."

What I need from this is an array of all the words outside the brackets, with the text inside each pair of brackets kept together as one string.

Something like this:

[0] => Code
[1] => is
[2] => fun
[3] => and good for the brain.
[4] => But
[5] => the
[6] => brain is
[7] => tired.

Help much appreciated.

Sebastian

2 Answers

You could also try the regex below:

(?<=\[)[^\]]*|[.\w]+

Code:

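<?php
$data = "Code is fun [and good for the brain.] But the [brain is] tired.";
// Alternative 1: text preceded by '[', up to (not including) the next ']'.
// Alternative 2: a run of word characters and dots outside the brackets.
$regex = '~(?<=\[)[^\]]*|[.\w]+~';
preg_match_all($regex, $data, $matches);
print_r($matches); // $matches[0] holds the full matches
?>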

Output:

Array
(
    [0] => Array
        (
            [0] => Code
            [1] => is
            [2] => fun
            [3] => and good for the brain.
            [4] => But
            [5] => the
            [6] => brain is
            [7] => tired.
        )

)

The first alternative, (?<=\[)[^\]]*, uses a lookbehind to match all the characters inside the brackets [] without consuming the opening bracket. The second alternative, [.\w]+, matches one or more word characters or dots in the rest of the string.
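As a minimal sketch of how this pattern behaves on other input, the snippet below wraps it in a helper. The function name tokenize() and the second test string are assumptions for illustration and are not part of the answer.

```php
<?php
// Hypothetical helper: the name tokenize() is illustrative only.
function tokenize(string $text): array
{
    // Same pattern as in the answer above.
    preg_match_all('~(?<=\[)[^\]]*|[.\w]+~', $text, $matches);
    return $matches[0]; // full matches only; the pattern has no capture groups
}

$tokens = tokenize("Code is fun [and good for the brain.] But the [brain is] tired.");
print_r($tokens);

// Outside the brackets, punctuation other than '.' is skipped, not returned.
$other = tokenize("Hi, there [a, b]");
print_r($other);
```

Here the comma after "Hi" is dropped because [.\w]+ only accepts word characters and dots, while the comma inside the brackets is kept because [^\]]* accepts anything except a closing bracket.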

Avinash Raj

You can use the following regex:

(?:\[([\w .!?]+)\]|(\w+))

The regex contains two alternatives: one to capture the text inside a pair of square brackets, and one to capture every other word. The bracketed text ends up in capture group 1 and the other words in capture group 2.

This assumes that the part inside the square brackets contains only letters, digits, _, spaces, !, ., and ?. If you need more punctuation, it is easy to add it to the character class.

If you don't want to be that specific about what should be captured, you can use a negated character class instead: specify what not to match rather than what to match. The expression then becomes: (?:\[([^\[\]]+)\]|(\w+))

Explanation:

(?:              # Begin non-capturing group
  \[             #   Match a literal '['
    (            #   Start capturing group 1
      [\w .!?]+  #     Match word characters, spaces, '.', '!' or '?'
    )            #   End capturing group 1
  \]             #   Match literal ']'
  |              #  OR
  (              #   Begin capturing group 2
    \w+          #     Match rest of the words
  )              #   End capturing group 2
)                # End non-capturing group
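
Because the results are split across two capture groups, they have to be merged to get a single flat array. The following is a rough sketch of one way to do that with PREG_SET_ORDER, using the negated-class variant of the pattern. The helper name flatten_groups() is hypothetical, and the second pattern with [\w.]+ is an assumed tweak that is not part of the answer.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function flatten_groups(string $pattern, string $text): array
{
    // PREG_SET_ORDER yields one array per match: [full, group 1, group 2 (if set)].
    preg_match_all($pattern, $text, $sets, PREG_SET_ORDER);

    $tokens = [];
    foreach ($sets as $m) {
        // Group 1 is non-empty only when the bracket alternative matched;
        // otherwise the word is in group 2.
        $tokens[] = ($m[1] !== '') ? $m[1] : $m[2];
    }
    return $tokens;
}

$data = "Code is fun [and good for the brain.] But the [brain is] tired.";

// Pattern from the answer (negated-class variant).
$tokens = flatten_groups('~(?:\[([^\[\]]+)\]|(\w+))~', $data);

// Assumed tweak: [\w.]+ instead of \w+ so that "tired." keeps its dot.
$withDots = flatten_groups('~(?:\[([^\[\]]+)\]|([\w.]+))~', $data);

print_r($tokens);
print_r($withDots);
```

With (\w+) the last token comes out as "tired" without the trailing dot, since \w does not match punctuation. Widening the class as in the second call is one way to keep it.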

Amal Murali