
I just want to make my blog as easy to code as it can be. And my question is:

How (if it's possible) can I encode everything inside the HTML tag <code> with htmlentities()?

I want this: if I write a post about how to make something, I won't need to encode the code with some online encoder, but can simply write something like

"Just simply put
<code>
encoded code
</code>
and this <b>bold</b> text will be bold, because it isn't inside <code>"

Is it possible in PHP with some function used like

encode_tags($text,"<code>","</code>");

?

Dan B.
  • Use an HTML parser and selectively encode parts. – tadman May 14 '18 at 17:30
  • @tadman I'm new to selectively encoding parts and don't know how to do that. That's basically my question. – Dan B. May 14 '18 at 17:36
  • Step one should be visiting [Composer](https://getcomposer.org) and seeing what tools are available to solve this problem. There's a multitude of HTML parsers there, some easy to use, some much more flexible, that you can pick from. The important thing is to know what options you have, because in the PHP world there's usually a lot of them. – tadman May 14 '18 at 17:37
  • @tadman do you have any preferred parser? I will try to use the most downloaded one. But thanks for your answer. – Dan B. May 14 '18 at 17:54
  • What works for me might be too complicated or too simple for you, so I'm hesitant to make any specific recommendations. Go with what feels best. They all do similar things. – tadman May 14 '18 at 17:56
  • I want it as simple as possible. I just want, as said in the question, to HTML-encode the text between `<code>` tags, because I don't want to show it as HTML but as plain text (that's what htmlspecialchars() does) – Dan B. May 15 '18 at 14:51
  • You can't really use an HTML parser for this: The goal is to *not* parse the contents of `<code>` tags as HTML. (It's basically broken input, which is always a nightmare to deal with and requires lots of heuristics). – Quentin May 15 '18 at 14:53
  • yeah... I don't want to parse it because in most cases, it's PHP code.... – Dan B. May 15 '18 at 14:55

1 Answer

Your input string (slightly edited to clarify my answer):

$string = "Just simply put
<code>
<p>encoded code</p>
</code>
and this <b>bold</b> text will be bold, 
because it isn't inside <code><b>code tags</b></code>";

Step 1:
Break your string into the parts surrounded by <code> tags. Note that your regex should use # rather than / as the delimiter, so you don't need to escape the / in </code>.

 preg_match_all("#<code>(.*?)</code>#is", $string, $codes);

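Note the s modifier at the end of the regex: it lets the dot in the group (.*?) match line breaks too, so a <code> block may span several lines. The i modifier makes the tag match case-insensitive.

The above pattern is lazy (see links at the bottom), so each match stops at the first closing </code>, and it will not match incomplete tags (such as <code> with no corresponding </code>).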
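To make the shape of `$codes` concrete, here is a small sketch on a made-up input string (the sample text is only an assumption for illustration, not the input from the question):

```php
<?php
// Made-up sample: one single-line and one multi-line <code> block.
$string = "a <code><p>x</p></code> b <code>\n<b>y</b>\n</code> c";

preg_match_all("#<code>(.*?)</code>#is", $string, $codes);

// $codes[0] holds each full match, including the <code> tags.
// $codes[1] holds only the captured text between the tags.
print_r($codes);
```

With the default flags, `preg_match_all` orders the result by pattern, so index 0 is the list of full matches and index 1 is the list of first-group captures.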

Step 2:
Make the HTML changes as required to each of the found substrings (You should be familiar with how preg_match_all returns the data from the function, see link at the bottom):

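$replace = [];
// $codes[1] holds the text between each pair of <code> tags.
foreach($codes[1] as $key=>$codeBlock ){
    // The final false disables double encoding of existing entities.
    $replace[$key] = htmlentities($codeBlock, ENT_QUOTES, "UTF-8", false);
}
unset($key, $codeBlock);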
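The fourth argument of htmlentities() is its double_encode flag. A minimal sketch of what it changes, using a made-up code block that already contains an entity:

```php
<?php
// Made-up sample that already contains the entity &amp;
$codeBlock = '<b>A &amp; B</b>';

// double_encode = false (as in Step 2): the existing &amp; is left alone.
$keep = htmlentities($codeBlock, ENT_QUOTES, "UTF-8", false);

// double_encode = true (the default): the & of &amp; is encoded again.
$again = htmlentities($codeBlock, ENT_QUOTES, "UTF-8", true);

echo $keep, "\n";
echo $again, "\n";
```

If the posted code itself contains entities that should be shown literally, the default (true) may be the better fit; which one is wanted depends on the blog's content.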

Step 3:
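Replace each original full match from $codes[0] (tags included) with its converted value from Step 2. Note that $codes[0] is NOT the same array as $codes[1], which was converted in Step 2, so the <code> tags themselves are dropped from the result: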

foreach($codes[0] as $key=>$replacer){
    $string = str_replace($replacer, $replace[$key], $string);
}
unset($key, $replacer, $replace);

Output:

Rendered in a browser, the resulting string is displayed as:

Just simply put

<p>encoded code</p>

and this bold text will be bold, because it isn't inside <b>code tags</b>
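The three steps can also be folded into one pass with preg_replace_callback(). This is only a sketch under some assumptions: the helper name encode_tags is hypothetical (borrowed from the question), and unlike the steps above it keeps the tags around the encoded text.

```php
<?php
// Hypothetical helper, named after the function sketched in the question.
function encode_tags(string $text, string $open = "<code>", string $close = "</code>"): string
{
    // preg_quote() escapes regex metacharacters in the tag strings; "#" is the delimiter.
    $pattern = "#" . preg_quote($open, "#") . "(.*?)" . preg_quote($close, "#") . "#is";

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($open, $close): string {
            // $m[1] is the text between the tags; the tags are re-added as passed in.
            return $open . htmlentities($m[1], ENT_QUOTES, "UTF-8", false) . $close;
        },
        $text
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

$string = "this <b>bold</b> text, <code><b>code tags</b></code>";
echo encode_tags($string);
```

Because each match is replaced in place, the encoded text is never searched for a second time, which avoids accidentally touching identical text outside the tags.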


You should be familiar with the preg_match_* family of PHP functions as well as PCRE regular expressions in general.

Please also read this here, and here and read this and especially this.

Cheers

Martin
  • @DanB. of course; simply replace the second foreach `$codes[0]` with `$codes[1]`. An uptick next to my answer would be good too, thanks `;-)` – Martin May 15 '18 at 16:34