Regex for new line within html tag

Question

I am using a regex to replace any p tag that has HTML attributes with a plain p tag without attributes. The regex is:

$html = preg_replace("/<p[^>]*>(.+?)<\/p>/i", "<p>$1</p>", $html);

The regex works fine if the p tag's content does not contain a newline, for example:

<p style="text-align: center;">It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout</p>

But when the p tag's content contains newlines, the above regex does not match. For example:

<p style="text-align: center;">It is a long established fact that a reader will be
distracted by the readable <br />
content of a page when looking at its layout</p>

Could someone suggest what changes the above regex needs so that it also works when the p tag's content includes newlines?

Better not to use a regular expression for this. https://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php — CertainPerformance, May 09 '19 at 06:30
@CertainPerformance Thank you for suggestion, I will do the same for the future. — Katty, May 09 '19 at 06:50

score 1 · Accepted Answer · answered May 09 '19 at 06:33

If you must, use

$html = preg_replace("/<p[^>]*>(.+?)<\/p>/is", "<p>$1</p>", $html);
#                                          ^

The s modifier enables single-line (DOTALL) mode, in which the dot matches newline characters as well. The usual warning against using regular expressions on HTML still applies.
See a demo on regex101.com.
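
To make the effect of the s modifier concrete, here is a small sketch that runs both patterns against a multi-line paragraph like the one in the question. The variable names $withoutS and $withS are made up for this illustration.

```php
<?php
// Sample input modelled on the question: a <p> whose content spans lines.
$html = "<p style=\"text-align: center;\">It is a long established fact that a reader will be\n"
      . "distracted by the readable <br />\n"
      . "content of a page when looking at its layout</p>";

// Without "s" the dot stops at "\n", so (.+?) can never reach </p>
// and preg_replace() returns the input unchanged.
$withoutS = preg_replace('/<p[^>]*>(.+?)<\/p>/i', '<p>$1</p>', $html);

// With "s" (PCRE_DOTALL) the dot also matches "\n", so the whole
// paragraph is captured and the attributes are dropped.
$withS = preg_replace('/<p[^>]*>(.+?)<\/p>/is', '<p>$1</p>', $html);

var_dump($withoutS === $html);                      // bool(true)
var_dump(strpos($withS, '<p>It is a long') === 0);  // bool(true)
```

One caveat with this pattern: <p[^>]*> also matches other tags whose names start with "p", such as <pre>. Adding a word boundary (<p\b[^>]*>) is one way to narrow it.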

score 0 · Answer 2 · answered May 09 '19 at 06:42

To use a DOM parser, it's easy enough to use DOMDocument and loadHTML().

This loads the document and then uses getElementsByTagName() to select all of the <p> tags. Then for each tag it finds, it checks if it has attributes and removes them if needed...

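$doc = new DOMDocument();
$doc->loadHTML($html);

$pTags = $doc->getElementsByTagName("p");
foreach ( $pTags as $p )    {
    if ( $p->hasAttributes() )  {
        // $p->attributes is a live collection: removing entries while
        // iterating over it directly can skip attributes, so iterate
        // over an array copy instead.
        foreach ( iterator_to_array($p->attributes) as $attribute )    {
            $p->removeAttribute($attribute->nodeName);
        }
    }
}

// Save/echo the resultant HTML
echo $doc->saveHTML();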
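
Note that loadHTML() treats its input as a full document, so saveHTML() normally adds a DOCTYPE and <html>/<body> wrappers. If $html is only a fragment, something along these lines may help. This is a sketch, not part of the answer above: the function name stripParagraphAttributes and the wrapper <div> are assumptions made for illustration.

```php
<?php
// Hypothetical helper: strip all attributes from every <p> in an HTML
// fragment and return the fragment without html/body wrappers.
function stripParagraphAttributes(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml parse warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    // Wrap the fragment in a <div> so it has a single root element; the
    // flags ask libxml not to add <html>/<body> or a DOCTYPE.
    $doc->loadHTML(
        '<div>' . $html . '</div>',
        LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('p') as $p) {
        // The attribute map is live, so keep removing the first entry
        // until none are left.
        while ($p->attributes->length > 0) {
            $p->removeAttribute($p->attributes->item(0)->nodeName);
        }
    }

    // Serialise only the children of the wrapper <div>.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$result = stripParagraphAttributes("<p style=\"a\" class=\"b\">x\ny</p>");
echo $result;
```

The serialised markup may differ from the input in small ways (for example, <br /> is typically written back as <br>), and non-ASCII input may need explicit encoding handling, so it is worth checking the output on real data.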
