Possible Duplicate:
How to parse and process HTML with PHP?

I'm not very good with regex, but I found this code:

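<?php
$string = "some text (a(b(c)d)e) more text";
// \( ... \) matches one balanced pair of parentheses.
// [^()]+ consumes text without parentheses; (?R) recurses into the whole pattern.
if (preg_match("/\((?>[^()]+|(?R))*\)/", $string, $matches))
{
    echo "<pre>"; print_r($matches); echo "</pre>";
}
?>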

I'm trying to change the regex pattern to match opening and closing HTML tags instead of parentheses, but I can't figure out how to mimic "[^()]+" so that it matches tags instead of parentheses.

The purpose is to let me define a new HTML tag whose contents I can access regardless of how many times the tag is nested within itself.
Thank you.

Max

1 Answer

[^()] defines a negated character class: the leading ^ means "any character except the ones that follow". So in your example, [^()]+ matches a run of anything except parentheses.

If you're parsing the content of an HTML tag, you need [^<>]+ instead.
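
A character class alone cannot stand in for a whole tag, because a closing tag is several characters long. One way to carry the recursive pattern from the question over to tags is a negative lookahead. This is only a minimal sketch, assuming a plain, attribute-less <div> as the nested tag and a made-up sample string (both are illustrative, not from the question):

```php
<?php
// Hypothetical sample: a <div> nested inside itself.
$sample = 'x <div>a<div>b</div>c</div> y';

// [^<]+          consumes text that contains no "<"
// <(?!/?div>)    consumes a "<" that does not start <div> or </div>
// (?R)           recurses into the whole pattern for a nested <div>...</div>
$pattern = '~<div>(?>[^<]+|<(?!/?div>)|(?R))*</div>~';

$found = preg_match($pattern, $sample, $m);
echo $m[0], "\n"; // <div>a<div>b</div>c</div>
```

The sketch does not handle attributes, comments, or unbalanced markup, which is part of why a real parser is usually the safer choice.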

If you have content like <div>Blah <a>foo</a>bar</div> and you want to match Blah <a>foo</a>bar, you should use a regexp like ~<div>(.+?)</div>~

The ? after a quantifier makes it lazy (non-greedy): the regexp "stops eating" as soon as it encounters the first </div>.
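
To make the difference concrete, here is a small comparison of the greedy and lazy forms. The input string is invented for illustration:

```php
<?php
// Hypothetical input with two sibling <div> elements.
$html = '<div>one</div><div>two</div>';

preg_match('~<div>(.+)</div>~', $html, $greedy);  // greedy: runs to the last </div>
preg_match('~<div>(.+?)</div>~', $html, $lazy);   // lazy: stops at the first </div>

echo $greedy[1], "\n"; // one</div><div>two
echo $lazy[1], "\n";   // one
```

Note that "the first </div>" is also the limitation: if the same tag is nested, as in <div>a<div>b</div>c</div>, the lazy pattern captures only a<div>b.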

Anyway... you should rather use DOM (DOMDocument) and DOMXPath::query() when parsing HTML. Here's some random tutorial from Google.
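
A rough sketch of the DOM approach, assuming the nested tag is a <div> and using an invented fragment. The innerHtml() helper is a hypothetical convenience function, not part of the DOM extension:

```php
<?php
// Hypothetical fragment with the same tag nested inside itself.
$html = '<div>Blah <div>foo</div> bar</div>';

libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$divs  = $xpath->query('//div'); // every <div>, at any nesting depth

// Hypothetical helper: serialize the children of a node back to markup.
function innerHtml(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

foreach ($divs as $div) {
    echo $div->textContent, "\n"; // text of the element, nested tags stripped
    echo innerHtml($div), "\n";   // markup inside the element
}
```

Because the parser builds a tree, each nesting level is simply another node in the result, so no recursion in the pattern is needed.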

Vyktor
  • the example with Blah foobar would suit my needs better if it could accurately parse Blah foo bar, but thanks for the info about DOM, I'll look into it – Max Feb 11 '12 at 14:48
  • @Max DOM will have better performance and all... I may add an example of parsing it if you want... – Vyktor Feb 11 '12 at 14:50