
I am using a regex to replace any p tag that has HTML attributes with a plain p tag without attributes. The regex is:

$html = preg_replace("/<p[^>]*>(.+?)<\/p>/i", "<p>$1</p>", $html);

The regex works fine when the p tag's content contains no newlines, for example:

<p style="text-align: center;">It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout</p>

But when the p tag's content contains newlines, the regex above does not match. For example:

<p style="text-align: center;">It is a long established fact that a reader will be
distracted by the readable <br />
content of a page when looking at its layou</p>

Could someone suggest what changes are required in the regex above so that it also works when the p tag's content includes newlines?

Katty

2 Answers


If you must, use

$html = preg_replace("/<p[^>]*>(.+?)<\/p>/is", "<p>$1</p>", $html);
#                                          ^

which enables single-line mode (the s modifier marked above), so that the dot also matches newline characters. The usual warning against using regular expressions on HTML tags applies nevertheless.
See a demo on regex101.com.
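As a rough check, the modifier can be tried against the multi-line paragraph from the question. This is only a sketch: the sample string and the <pre> element are made up for illustration, and the \b after <p is my own addition (not part of the original pattern) so that tags which merely start with "p", such as <pre>, are not treated as paragraphs.

```php
<?php
// Sample input (assumed for illustration): a paragraph spanning several
// lines, followed by a <pre> element that should be left untouched.
$html = "<p style=\"text-align: center;\">It is a long established fact that a reader will be\n"
      . "distracted by the readable <br />\n"
      . "content of a page</p>\n"
      . "<pre class=\"code\">left alone</pre>";

// i = case-insensitive, s = the dot also matches newlines.
// \b (an addition, not in the original pattern) requires a word boundary
// after "p", so "<pre" does not count as an opening p tag.
$clean = preg_replace('/<p\b[^>]*>(.+?)<\/p>/is', '<p>$1</p>', $html);

echo $clean;
```

The paragraph should come back with a bare <p> opening tag while the line breaks inside it are preserved.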

Jan

To use a DOM parser, it's easy enough to use DOMDocument and loadHTML().

This loads the document and then uses getElementsByTagName() to select all of the <p> tags. Then for each tag it finds, it checks if it has attributes and removes them if needed...

$doc = new DOMDocument();
$doc->loadHTML($html);

$pTags = $doc->getElementsByTagName("p");
foreach ( $pTags as $p )    {
    // $p->attributes is a live list, so removing entries inside a foreach
    // over it can skip some; remove the first one until none remain.
    while ( $p->hasAttributes() )  {
        $p->removeAttribute($p->attributes->item(0)->nodeName);
    }
}

// Save/echo the resultant HTML
echo $doc->saveHTML();
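
Note that saveHTML() without arguments prints a whole document, including the doctype and the html/body wrappers that loadHTML() adds around a fragment. If only the cleaned fragment is wanted, something like the following might work. It is a sketch under assumptions: stripParagraphAttributes() is a hypothetical helper name, the wrapping <div> is my own device to give the fragment a single root, and the input is assumed to be plain ASCII (loadHTML() has its own ideas about character encoding that are not handled here).

```php
<?php
// Hypothetical helper: removes every attribute from each <p> in an HTML
// fragment and returns the fragment without doctype/html/body wrappers.
function stripParagraphAttributes(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a <div> (an assumption of this sketch) so it has
    // one known root element that can be located again after parsing.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('p') as $p) {
        // The attribute list is live, so always remove the first entry.
        while ($p->hasAttributes()) {
            $p->removeAttribute($p->attributes->item(0)->nodeName);
        }
    }

    // The first <div> in document order is the wrapper added above.
    $root = $doc->getElementsByTagName('div')->item(0);

    // Serialize only the wrapper's children, i.e. the original fragment.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage with a multi-line paragraph like the one in the question.
echo stripParagraphAttributes(
    "<p style=\"text-align: center;\" class=\"lead\">It is a long established fact\n"
    . "that a reader will be <br />\ndistracted</p>"
);
```

The serializer may normalize some markup (for example how <br /> is written), so the output is equivalent HTML rather than a byte-for-byte copy of the input.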
Nigel Ren