I am looking for the best way to do this. Does some handy regex exist, or should I loop over the sentence section by section?

Okay, I have a sentence like this:

"The rooms rooms and rooms again were great, the food was not but the beds were extremely comfortable."

I have an array of items (delimiters):

 array('food','room','bed');

I would like to obtain the sections of the sentence between these words, i.e. split it from one delimiter to the next, if that makes sense.

The first section:

"The"

The second section (up to the closest item from the array, i.e. the next delimiter):

"rooms "

The third section:

"rooms and "

The fourth section:

"rooms again were great, the"

And the fifth section:

"food was not but the ".

And the sixth section:

"beds were extremely comfortable."

Basically split the sentence from one key word to another repeatedly.

The point of the delimiters is to split the sentence, so a plain match is enough: if the sentence contains the word "rooms", it matches the delimiter "room". Plurals are not important; the point is to split the sentence into multiple sections based on the delimiters (items from the array).

Any idea please?

Lukas Lukac

1 Answer

You could split using a lookahead:

$pattern = '/(?=room|food|bed)/i';

$str = "The rooms rooms and rooms again were great, the food was not but the beds were extremely comfortable.";

print_r(preg_split($pattern, $str));

Output (test @ eval.in):

Array
(
    [0] => The 
    [1] => rooms 
    [2] => rooms and 
    [3] => rooms again were great, the 
    [4] => food was not but the 
    [5] => beds were extremely comfortable.
)

This uses the i (PCRE_CASELESS) modifier. You might want to add \b word boundaries to some of the words.

Also see: test at regex101, regex faq
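
If the keywords come from an array, as in the question, the pattern could be assembled from it rather than typed by hand. The following is only a minimal sketch: `buildSplitPattern` is a hypothetical helper name, not part of the original answer, and it assumes `/` is used as the pattern delimiter. Each word goes through `preg_quote` so that any regex metacharacters in it are treated literally.

```php
<?php
// Hypothetical helper (not from the original answer): builds the lookahead
// pattern from an array of keyword strings.
function buildSplitPattern(array $words): string
{
    // Escape regex metacharacters in each word; '/' is the pattern delimiter.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);

    // Zero-width lookahead: split *before* each keyword, case-insensitively.
    return '/(?=' . implode('|', $quoted) . ')/i';
}

$words = array('food', 'room', 'bed');
$str = "The rooms rooms and rooms again were great, the food was not but the beds were extremely comfortable.";

$pattern = buildSplitPattern($words);

// PREG_SPLIT_NO_EMPTY discards any empty pieces from the result.
$sections = preg_split($pattern, $str, -1, PREG_SPLIT_NO_EMPTY);

print_r($sections);
```

Because the lookahead consumes no characters, each keyword stays at the start of its own section instead of being removed by the split.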

Jonny 5
  • Seems like this is exactly what I needed. I will play with it a little bit to be sure and then mark it as the answer. Thank you Jonny for now! – Lukas Lukac May 17 '14 at 15:18
  • @Trki Welcome! Also see [this example @ eval.in](https://eval.in/152454) with your array. If the words contain [regex metacharacters](http://www.hscripts.com/tutorials/regular-expression/metacharacter-list.php), you need to [preg_quote](http://www.php.net/manual/en/function.preg-quote.php) the items. – Jonny 5 May 17 '14 at 15:46
  • Love zero-width splits, +1 :) – zx81 May 18 '14 at 03:16
  • @Jonny5 is there any way I could select the whole word? Example: $pattern = '/(?=room|food|bed)/i'; $string = "The bedrooms are great." Split -> [0] => the, [1] => bedrooms are great. ? – Lukas Lukac May 21 '14 at 07:54
  • @Trki Possibly want to add `\b` [word-boundaries](http://www.regular-expressions.info/wordboundaries.html) [before](http://regex101.com/r/xI1iN5) [and](http://regex101.com/r/fH2jY6)/or after. See [/(?=\b(?:room|bed|food))/i](http://regex101.com/r/xI1iN5) with a word-boundary before each of the words inside the `(?:` non-capturing [group](http://www.regular-expressions.info/brackets.html). Also of course you can add the boundaries to single words in the alternation, wherever: [/(?=room\b|\bbed|food)/i](http://regex101.com/r/uC1rM4). – Jonny 5 May 21 '14 at 09:05
  • Aaah, I was using the wrong syntax all the time, that's why it wasn't working for me. Thx again! You are the best :) – Lukas Lukac May 21 '14 at 14:16
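
To illustrate the word-boundary suggestion from the comments, here is a small sketch comparing the two patterns on the "bedrooms" example. The variable names `$loose` and `$strict` are made up for this illustration.

```php
<?php
$str = "The bedrooms are great.";

// Without a boundary, "room" also matches inside "bedrooms",
// so the word itself gets split.
$loose = preg_split('/(?=room|food|bed)/i', $str);

// With \b in front of the non-capturing group, a keyword only matches
// where a word starts, so "bedrooms" stays in one piece.
$strict = preg_split('/(?=\b(?:room|food|bed))/i', $str);

print_r($loose);   // "The ", "bed", "rooms are great."
print_r($strict);  // "The ", "bedrooms are great."
```

The pattern strings are single-quoted so that PHP passes `\b` through to PCRE unchanged.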