Hi, I am trying to construct a regular expression (PCRE) that finds two words near each other, but only when both occur on the same line. The usual "near" examples are insufficient for my requirements because "\W" also matches newlines. I have spent quite a bit of time looking for an answer and have so far been unsuccessful. Here is what I have so far:

(?i)(?:\b(tree)\b)\W+(?:\w+\W+){0,5}?\b(house)\b.*

I want this to match on:

here is a tree with a house

But not match on:

here is a tree 
with a house
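
For reference, the behaviour described above can be reproduced with a minimal sketch in Python's `re` module, assuming it treats this particular pattern the same way PCRE does. The names `ORIGINAL`, `same_line` and `two_lines` are illustrative only.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"(?i)(?:\b(tree)\b)\W+(?:\w+\W+){0,5}?\b(house)\b.*")

same_line = "here is a tree with a house"
two_lines = "here is a tree \nwith a house"

# \W also matches "\n", so the pattern finds a match in both inputs,
# including the two-line input that should be rejected.
print(bool(ORIGINAL.search(same_line)))
print(bool(ORIGINAL.search(two_lines)))
```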

Any help would be greatly appreciated!

nhahtdh
  • "which occur on the same line". Don't know about you, but I'm pretty sure the second example is 2 lines… (o.O) ;-) – iain Sep 19 '14 at 10:09
  • What do you want to accomplish? Get the nearest/closest matches of these two words, but only if both occur on the same line? – Jonny 5 Sep 19 '14 at 11:08

6 Answers

How about:

\btree\b[^\n]+\bhouse\b
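
A quick check of this pattern, sketched in Python's `re` module on the assumption that it behaves like PCRE here. The helper name `find` is made up for the example. Note that the pattern as written is case-sensitive and puts no limit on the distance between the two words.

```python
import re

# [^\n]+ can cross anything except a line feed, so both words must share a line.
PATTERN = re.compile(r"\btree\b[^\n]+\bhouse\b")

def find(text):
    """Return the matched text, or None when the words are not on one line."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

print(find("here is a tree with a house"))
print(find("here is a tree \nwith a house"))
```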
llogiq

Just add a negative lookahead so that the run of non-word characters after the first word cannot include a newline character:

(?i)(?:\b(tree)\b)(?:(?!\n)\W)+(?:\w+\W+){0,5}?\b(house)\b.*
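
A small Python sketch of this pattern against the two inputs from the question, assuming `re` handles it like PCRE. It also includes a third, made-up input (`word_before_break`) to show a limitation: the `\W+` inside the repeated group is unchanged, so a line break that follows an intervening word is still accepted.

```python
import re

PATTERN = re.compile(
    r"(?i)(?:\b(tree)\b)(?:(?!\n)\W)+(?:\w+\W+){0,5}?\b(house)\b.*"
)

same_line = "here is a tree with a house"
two_lines = "here is a tree \nwith a house"
word_before_break = "a tree by\nthe house"  # hypothetical extra case

# The first two behave as the question asks for.
print(bool(PATTERN.search(same_line)))
print(bool(PATTERN.search(two_lines)))
# Still matches: "by" is consumed by \w+ and the newline by the inner \W+.
print(bool(PATTERN.search(word_before_break)))
```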

Avinash Raj

The dot matches any character except a newline, so a bounded run of dots keeps both words on the same line (the bound counts characters, not words):

(?i)\btree\b.{1,5}\bhouse\b

Note that there can never be zero characters between the two words: they would then form a single word and the \b would not match.
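
A quick Python check of the pattern above, plus a variant with a wider bound. The value 40 is an arbitrary assumption for illustration, not something from the answer, and the names `NARROW` and `WIDE` are made up. Because `.{1,5}` allows at most five characters between the words, the example sentence from the question (eight characters between "tree" and "house") needs the wider bound.

```python
import re

NARROW = re.compile(r"(?i)\btree\b.{1,5}\bhouse\b")
WIDE = re.compile(r"(?i)\btree\b.{1,40}\bhouse\b")  # 40 is an assumed bound

same_line = "here is a tree with a house"
two_lines = "here is a tree \nwith a house"

print(bool(NARROW.search("tree by house")))  # 4 characters in between
print(bool(NARROW.search(same_line)))        # 8 characters in between
print(bool(WIDE.search(same_line)))
print(bool(WIDE.search(two_lines)))          # the dot stops at the newline
```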

Bohemian

Just replace \W with [^\w\r\n] in your regex:

(?i)(?:\b(tree)\b)[^\w\r\n]+(?:\w+[^\w\r\n]+){0,5}?\b(house)\b.*
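
A minimal Python sketch of this replacement, assuming `re` treats the pattern like PCRE. The helper `words` and the third test string are made up for the example. Since every `\W` is replaced, a line break is rejected both directly after the first word and after any word in between.

```python
import re

PATTERN = re.compile(
    r"(?i)(?:\b(tree)\b)[^\w\r\n]+(?:\w+[^\w\r\n]+){0,5}?\b(house)\b.*"
)

def words(text):
    """Return the two captured words, or None when there is no match."""
    m = PATTERN.search(text)
    return m.groups() if m else None

print(words("here is a tree with a house"))
print(words("here is a tree \nwith a house"))
print(words("a tree by\nthe house"))
```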
Toto

To get the closest matches of both words on the same line, an option is to use a negative lookahead:

(?i)(\btree\b)(?>(?!(?1)).)*?\bhouse\b
  • By default, the dot . does not match a newline (it does only with the s / DOTALL modifier).
  • (?>(?!(?1)).)*? matches as few characters as possible, none of which may begin another \btree\b.
  • (?1) reuses the pattern of the first capturing group.

Example at regex101.com; Regex FAQ
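
Python's built-in `re` module has no `(?1)` subroutine call, so the sketch below is an adaptation rather than the pattern above: the first word is spelled out again inside the lookahead, and a plain non-capturing group stands in for the atomic group. The helper name `closest` is made up for the example.

```python
import re

# Adapted pattern: (?!\btree\b) replaces (?!(?1)), (?:...) replaces (?>...).
PATTERN = re.compile(r"(?i)(\btree\b)(?:(?!\btree\b).)*?\bhouse\b")

def closest(text):
    """Return the shortest same-line 'tree ... house' span, or None."""
    m = PATTERN.search(text)
    return m.group(0) if m else None

# The match starts at the second "tree", the one nearest to "house".
print(closest("a tree and a tree with a house"))
# No match: the dot does not cross the newline.
print(closest("a tree\nwith a house"))
```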

Jonny 5

Maybe this helps; I found it at https://www.regular-expressions.info/near.html:

\bword1\W+(?:\w+\W+){1,6}?word2\b.    
didi