
I'm trying to write an XML file containing CDATA nodes using boost::property_tree. However, since characters such as <, >, and & are escaped automatically when the XML file is written, something like

xml.put("node", "<![CDATA[message]]>")

will appear as

<node>&lt;![CDATA[message]]&gt;</node>

in the XML-file. Is there any way to properly write CDATA-nodes using property_tree or is this simply a limitation of the library?

1 Answer


The Boost documentation states that property_tree's XML storage does not distinguish between CDATA and non-CDATA values:

The XML storage encoding does not round-trip perfectly. A read-write cycle loses trimmed whitespace, low-level formatting information, and the distinction between normal data and CDATA nodes. Comments are only preserved when enabled. A write-read cycle loses trimmed whitespace; that is, if the origin tree has string data that starts or ends with whitespace, that whitespace is lost.

The few times I've faced this problem, it was in very specific cases where I knew no other escaped data would be needed, so simply post-processing the generated file to replace the escaped characters was enough.

As a general example:

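#include <fstream>
#include <sstream>
#include <boost/algorithm/string/replace.hpp>
#include <boost/property_tree/xml_parser.hpp>
namespace pt = boost::property_tree;

// Inside a function, with xml a populated pt::ptree and path the output file name:
std::ostringstream ss;
pt::write_xml(ss, xml, pt::xml_writer_make_settings<std::string>('\t', 1));

// Undo the escaping everywhere; "&amp;" goes last so "&amp;lt;" does not become "<"
auto cleaned_xml = boost::replace_all_copy(ss.str(), "&gt;", ">");
cleaned_xml = boost::replace_all_copy(cleaned_xml, "&lt;", "<");
cleaned_xml = boost::replace_all_copy(cleaned_xml, "&amp;", "&"); // last one

std::ofstream fo(path);
fo << cleaned_xml;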

A more elaborate solution would find each opening &lt;![CDATA[ and its closing ]]&gt;, and replace only within those limits, so that correctly escaped symbols elsewhere in the document are left alone.
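A minimal sketch of that idea, using only the standard library and operating on the string produced by write_xml. The names restore_cdata, unescape and replace_all are hypothetical helpers, not part of Boost. The sketch assumes the writer escapes <, >, &, " and ' as the five predefined XML entities, and that the CDATA payload itself never contains ]]> (which XML forbids inside a CDATA section anyway).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: replace every occurrence of `from` in `s` with `to`.
inline void replace_all(std::string& s, const std::string& from, const std::string& to)
{
    for (std::size_t p = s.find(from); p != std::string::npos;
         p = s.find(from, p + to.size()))
        s.replace(p, from.size(), to);
}

// Hypothetical helper: undo the five predefined XML entities.
// "&amp;" is handled last so that "&amp;lt;" becomes "&lt;" and not "<".
inline std::string unescape(std::string s)
{
    replace_all(s, "&lt;", "<");
    replace_all(s, "&gt;", ">");
    replace_all(s, "&quot;", "\"");
    replace_all(s, "&apos;", "'");
    replace_all(s, "&amp;", "&");
    return s;
}

// Hypothetical helper: restore escaped CDATA sections in serialized XML.
// Only text between "&lt;![CDATA[" and "]]&gt;" is unescaped; everything
// outside those markers is copied through unchanged.
inline std::string restore_cdata(const std::string& xml)
{
    static const std::string open_esc  = "&lt;![CDATA[";
    static const std::string close_esc = "]]&gt;";

    std::string out;
    std::size_t pos = 0;
    for (;;) {
        const std::size_t begin = xml.find(open_esc, pos);
        if (begin == std::string::npos) break;
        const std::size_t body = begin + open_esc.size();
        const std::size_t end  = xml.find(close_esc, body);
        if (end == std::string::npos) break;  // unterminated: leave the rest as-is

        out.append(xml, pos, begin - pos);    // text before the section
        out += "<![CDATA[";
        out += unescape(xml.substr(body, end - body));
        out += "]]>";
        pos = end + close_esc.size();
    }
    out.append(xml, pos, std::string::npos);  // remaining tail
    return out;
}
```

With the example above, this would replace the three replace_all_copy calls: fo << restore_cdata(ss.str());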

Another solution is presented in this answer but I've never used it.

cbuchart