Using a CFile object, the code opens a file with "ANSI as UTF-8" encoding. Once the text is modified and the file is overwritten using the Write function, the encoding changes to ANSI.

I tried the following code to change the encoding of the text being written:

CComBSTR bstrContent;
m_spDOMDocument->get_xml(&bstrContent);
CString strContent(bstrContent);
CT2CA outputString(strContent, CP_UTF8);

File.SetLength(0);
File.SeekToBegin();
File.Write(strContent, ::strlen(outputString));
File.SeekToBegin();

This is as suggested in UTF-8, CString and CFile? (C++, MFC)

The file still gets written with ANSI encoding. How can the file be written in UTF-8 format?

Arun M

1 Answer

You're not using CT2CA properly: the converted outputString is never written, only its length is used. In any case, it is simpler to convert straight from the wide BSTR using CW2CA.

Try this:

CStringA outputString( CW2CA(bstrContent, CP_UTF8));

Then write the char-based outputString to the File, not strContent (I can't tell whether that is wide or not), and take the byte count from outputString as well, for example outputString.GetLength().
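To see why the buffer and the length must both come from the converted string, here is a minimal portable sketch of the UTF-16 to UTF-8 step. It is not MFC code: utf16ToUtf8 is a hypothetical helper standing in for what CW2CA(..., CP_UTF8) is expected to produce, and it substitutes U+FFFD for unpaired surrogates, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for CW2CA(text, CP_UTF8): turns UTF-16 code units
// into UTF-8 bytes. Unpaired surrogates are replaced by U+FFFD.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // Low surrogate with no preceding high surrogate.
        }
        assert(cp <= 0x10FFFF);
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The four-character text "café" becomes five bytes here, so a length taken from the wide string, or a buffer that was never converted, will not match what has to be written.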

Also you're not writing the BOM (Byte-Object-Marker), which may cause problems with some editors. Have a look at this: What's different between UTF-8 and UTF-8 without BOM?
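As a rough illustration of the BOM point, the sketch below writes the three UTF-8 BOM bytes (EF BB BF) ahead of the payload. It uses a standard std::ostream rather than CFile so it can be tried outside MFC; writeUtf8WithBom and startsWithUtf8Bom are hypothetical helper names, and the payload is assumed to already be UTF-8.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Writes the UTF-8 BOM followed by the already-encoded UTF-8 payload.
// unsigned char avoids narrowing 0xEF etc. into a signed char.
void writeUtf8WithBom(std::ostream& out, const std::string& utf8Bytes) {
    static const unsigned char kBom[3] = {0xEF, 0xBB, 0xBF};
    out.write(reinterpret_cast<const char*>(kBom), sizeof kBom);
    out.write(utf8Bytes.data(),
              static_cast<std::streamsize>(utf8Bytes.size()));
}

// Returns true when the buffer begins with the UTF-8 BOM.
bool startsWithUtf8Bom(const std::string& bytes) {
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}
```

With CFile the same idea would be two Write calls, one for the three BOM bytes and one for the converted buffer. When writing to a real file through a standard stream, open it in binary mode so the bytes are not altered.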

snowdude
  • Thanks for the quick reply. I tried using CStringA outputString( CT2CA(strContent, CP_UTF8)) It still does not write in UTF-8 encoding format. How do I write with BOM? (I am new to this...my questions may be naive!) – Arun M Jan 22 '14 at 14:01
  • I was able to add the BOM with the following code: char BOM[3]={0xEF, 0xBB, 0xBF}; File.Write(BOM,3); It creates a UTF-8 file. Trying to load this file using the IXMLDOMDocument2Ptr object fails. Only ANSI or "ANSI as UTF-8" files are getting loaded this way. I converted the file to "ANSI as UTF-8" encoding and the file loads without any issues. – Arun M Jan 22 '14 at 14:16
  • Did you try CW2CA? Also, are you compiling with Unicode enabled, i.e. are your CStrings wchar_t based? – snowdude Jan 22 '14 at 14:26
  • BTW: BOM means "Byte-Order-Marker" and not "Byte-Object-Marker". – Jabberwocky Jan 23 '14 at 08:39