
Why does encoding a string to byte[] with StreamWriter and Encoding.UTF8.GetBytes produce different results?

string value = "myTestValue";

byte[] data = Encoding.UTF8.GetBytes(value);
byte[] streamedData;
using (var memoryStream = new MemoryStream())
using (var streamWriter = new StreamWriter(memoryStream, Encoding.UTF8))
{
    streamWriter.Write(value);
    streamWriter.Flush();
    streamedData = memoryStream.ToArray();
}

// false
data.SequenceEqual(streamedData);

1 Answer


It comes down to the BOM (byte order mark) and the way Encoding.GetBytes() is implemented.

The static Encoding.UTF8 instance is configured to emit a BOM, as you can see from its preamble, which is the three bytes EF BB BF:

Encoding.UTF8.GetPreamble();
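To see the preamble for yourself, a minimal sketch along these lines should work. It assumes a project that supports top-level statements (C# 9 or later), and the variable names are illustrative only.

```csharp
using System;
using System.Text;

// Encoding.UTF8 is configured to emit a BOM, so its preamble is the UTF-8 BOM.
byte[] preamble = Encoding.UTF8.GetPreamble();
Console.WriteLine(BitConverter.ToString(preamble)); // EF-BB-BF

// A UTF8Encoding constructed with 'false' reports an empty preamble.
byte[] noPreamble = new UTF8Encoding(false).GetPreamble();
Console.WriteLine(noPreamble.Length); // 0
```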

As a result, StreamWriter correctly writes the BOM to the given Stream before the encoded text. Encoding.GetBytes(), however, never emits a BOM, even if you construct the UTF8Encoding object to do so:

byte[] withoutBom = new UTF8Encoding(false).GetBytes(value);
byte[] withBom = new UTF8Encoding(true).GetBytes(value);

// true
withoutBom.SequenceEqual(withBom);

If you want the StreamWriter to write without a BOM, you can construct it like this:

new StreamWriter(stream, new UTF8Encoding(false))

This way both byte arrays will be equal.
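As a rough end-to-end check, the sketch below compares both approaches with and without the BOM. WriteWith is a hypothetical helper introduced here for illustration, not part of the question's code, and the example again assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

string value = "myTestValue";
byte[] data = Encoding.UTF8.GetBytes(value);

// Without a BOM, the StreamWriter output matches GetBytes exactly.
byte[] streamedNoBom = WriteWith(new UTF8Encoding(false), value);
Console.WriteLine(data.SequenceEqual(streamedNoBom)); // True

// With Encoding.UTF8, the output is the preamble followed by the same bytes.
byte[] streamedWithBom = WriteWith(Encoding.UTF8, value);
byte[] dataWithBom = Encoding.UTF8.GetPreamble().Concat(data).ToArray();
Console.WriteLine(dataWithBom.SequenceEqual(streamedWithBom)); // True

// Hypothetical helper: writes text to a fresh MemoryStream using the given encoding.
static byte[] WriteWith(Encoding encoding, string text)
{
    using (var memoryStream = new MemoryStream())
    using (var streamWriter = new StreamWriter(memoryStream, encoding))
    {
        streamWriter.Write(text);
        streamWriter.Flush(); // Push buffered characters into the stream before copying.
        return memoryStream.ToArray();
    }
}
```

If you need the BOM in a byte[] produced by GetBytes, prepending GetPreamble() as shown is one way to do it.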