15

When a CSV file is generated using C# and opened in Microsoft Excel it displays  characters before special symbols e.g. £

In Notepad++ the hex value for  is: C2

So before writing the £ symbol to file, I have tried the following...

    var test = "£200.00";
    var replaced = test.Replace("\xC2", " ");

    StreamWriter outputFile = File.CreateText("testoutput.csv"); // default UTF-8
    outputFile.WriteLine(replaced);
    outputFile.Close(); 

When opening the CSV file in Excel, I still see the "Â" character before the £ symbol (hex equivalent \xC2 \xA3); It made no difference.

Do I need to use a different encoding? or am I missing something?

nimblebit
  • 303
  • 2
  • 10
  • 18
  • 1
    Did you read this question: https://stackoverflow.com/q/6002256/5311735? – Evk Feb 15 '18 at 09:50
  • Thank you @Evk. Unfortunately that particular post doesnt answer this question as I cannot control the clients Excel configuration. Also the CSV is being generated with .NET code. I'd like to resolve CSV output so it will work on any machine. – nimblebit Feb 15 '18 at 09:58
  • Well there are a lot of answers to that question, accepted one does not actually answers at all, so look at others. – Evk Feb 15 '18 at 09:59
  • Thanks @Evk i'm looking through each of the suggestions and will update the post if I find a solution – nimblebit Feb 15 '18 at 10:08
  • You might want to mention which version of Excel you're targeting. There are people with some _really_ old versions out there. – Nyerguds Feb 15 '18 at 19:05
  • I'm using Excel 2013 – nimblebit Feb 16 '18 at 17:36
  • Related: https://stackoverflow.com/questions/4414088/how-to-getbytes-in-c-sharp-with-utf8-encoding-with-bom – Dunc Feb 01 '21 at 11:20

2 Answers2

23

Thank you @Evk and @Mortalier, your suggestions lead me to the right direction...

I needed to update my StreamWriter so it would explicitly include UTF-8 BOM at the beginning http://thinkinginsoftware.blogspot.co.uk/2017/12/correctly-generate-csv-that-excel-can.html

So my code has changed from:

StreamWriter outputFile = File.CreateText("testoutput.csv"); // default UTF-8

To:

StreamWriter outputFile = new StreamWriter("testoutput.csv", false, new UTF8Encoding(true))

Or: Another solution I found here was to use a different encoding if you're only expecting latin characters... http://theoldsewingfactory.com/2010/12/05/saving-csv-files-in-utf8-creates-a-characters-in-excel/

StreamWriter outputFile = new StreamWriter("testoutput.csv", false, Encoding.GetEncoding("Windows-1252"))

My system will most likely use latin & non-latin characters so I'm using the UTF-8 BOM solution.

Final code

    var test = "£200.00";
    StreamWriter outputFile = new StreamWriter("testoutput.csv", false, new UTF8Encoding(true))
    outputFile.WriteLine(test);
    outputFile.Close();
nimblebit
  • 303
  • 2
  • 10
  • 18
  • 1
    You are relying on Excel to guess the encoding. When the first few bytes look like a BOM, it guesses by choosing the corresponding Unicode encoding. The canonical method of opening a text file is to give the program the encoding and, in the case of CSV, the header, column type, field separator, text qualifier and escape information. Excel supports all that with its text import dialog. (Of course, if you use a file type like xlsx that has all of that metadata embedded, the users don't have to supply it separately. It is very easy to create xlsx files in C# with free open-source libraries.) – Tom Blodget Feb 17 '18 at 19:27
3

I tried your code and Excel does show AŁ in the cell. Then I tried to open the csv with LibreOffice Clac. At first there too was AŁ, but on import the program will ask you about encoding. Once I chose UTF-8 the £ symbol was displayed correctly. My guess is that in fact there is an issue with your encoding.

This might help with Excel https://superuser.com/questions/280603/how-to-set-character-encoding-when-opening-excel

Mortaliar
  • 58
  • 4
  • Thank you for your response Mortaliar, however I cannot control the clients Excel configuration and set defaults as those are set by their organisation policies. The problem only manifests when it is generated with code. If it is manually created in notepad or any other application, it doesnt happen. I suspect it has something to do with the streamwriter, so I'd like to fix the problem from the code perspective. Do you have any ideas of where I could be going wrong from the code perspective? – nimblebit Feb 15 '18 at 10:03
  • 1
    Try this: https://stackoverflow.com/questions/8151379/forcing-streamwriter-to-change-encoding – Mortaliar Feb 15 '18 at 10:08
  • Thank you, I'm checking this out now :) – nimblebit Feb 15 '18 at 10:10