I tried this approach without any success.

the code I'm using:

// File name
String filename = String.Format("{0:ddMMyyHHmm}", dtFileCreated);
String filePath = Path.Combine(Server.MapPath("App_Data"), filename + ".txt");

// Process       
myObject pbs = new myObject();         
pbs.GenerateFile();

// pbs.GeneratedFile is a StringBuilder object

// Save file
Encoding utf8WithoutBom = new UTF8Encoding(true);
TextWriter tw = new StreamWriter(filePath, false, utf8WithoutBom);
foreach (string s in pbs.GeneratedFile.ToArray()) 
    tw.WriteLine(s);
tw.Close();

// Push Generated File into Client
Response.Clear();
Response.ContentType = "application/vnd.text";
Response.AppendHeader("Content-Disposition", "attachment; filename=" + filename + ".txt");
Response.TransmitFile(filePath);
Response.End();

the result:

It's writing the BOM no matter what, and special chars (like Æ Ø Å) are not correct :-/

I'm stuck!

My objective is to create a file using UTF-8 as Encoding and 8859-1 as CharSet

Is this really so hard to accomplish, or am I just having a bad day?

All help is greatly appreciated, thank you!

balexandre
  • "a file using UTF-8 as Encoding and 8859-1 as CharSet": encoding and charset are the same thing, so your requirement doesn't make sense... – Thomas Levesque Mar 23 '10 at 19:49

1 Answer

Well, it writes the BOM because you are instructing it to, in the line

Encoding utf8WithoutBom = new UTF8Encoding(true);

Passing true means that the BOM should be emitted; using

Encoding utf8WithoutBom = new UTF8Encoding(false);

writes no BOM.
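
To check this without writing a file, one option is to inspect the preamble that each encoding instance reports. This is only a minimal sketch; the class name BomDemo and the helper PreambleLength are made up for illustration:

```csharp
using System;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: number of preamble (BOM) bytes that a
    // UTF8Encoding created with the given flag would emit.
    public static int PreambleLength(bool emitBom)
    {
        return new UTF8Encoding(emitBom).GetPreamble().Length;
    }

    public static void Main()
    {
        Console.WriteLine(PreambleLength(true));   // 3 (the bytes EF BB BF)
        Console.WriteLine(PreambleLength(false));  // 0
    }
}
```

A StreamWriter writes exactly this preamble at the start of a new file, which is why the constructor argument decides whether the BOM shows up.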

My objective is to create a file using UTF-8 as Encoding and 8859-1 as CharSet

Sadly, this is not possible: either you write UTF-8 or you don't. As long as the characters you are writing are in the ASCII range (U+0000 to U+007F), the file will look like an ISO 8859-1 file; however, as soon as you output a character outside that range (e.g. ä, ö, ü), it will be written as a multibyte sequence.
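
The difference is easy to see at the byte level. The following is a small sketch (the class name ByteLengthDemo and the helper Hex are illustrative, not part of the question's code) that encodes an ASCII letter and "ä" (U+00E4) with both encodings:

```csharp
using System;
using System.Text;

public static class ByteLengthDemo
{
    // Hypothetical helper: hex dump of the bytes produced for 'text'.
    public static string Hex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        Encoding utf8 = new UTF8Encoding(false);

        Console.WriteLine(Hex("a", latin1));       // 61
        Console.WriteLine(Hex("a", utf8));         // 61
        Console.WriteLine(Hex("\u00E4", latin1));  // E4
        Console.WriteLine(Hex("\u00E4", utf8));    // C3-A4
    }
}
```

The ASCII letter is the same single byte in both encodings, while "ä" is one byte in ISO-8859-1 and two bytes in UTF-8, so a file cannot be both at once when it contains such characters.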

To write true ISO-8859-1 use:

Encoding isoLatin1Encoding = Encoding.GetEncoding("ISO-8859-1");

Edit: After balexandre's comment

I used the following code for testing ...

var filePath = @"c:\temp\test.txt";
var sb = new StringBuilder();
sb.Append("dsfaskd jlsadfj laskjdflasjdf asdkfjalksjdf lkjdsfljas dddd jflasjdflkjasdlfkjasldfl asääääjdflkaslj d f");

Encoding isoLatin1Encoding = Encoding.GetEncoding("ISO-8859-1");

TextWriter tw = new StreamWriter(filePath, false, isoLatin1Encoding);
tw.WriteLine(sb.ToString());
tw.Close();

And the file looks perfectly fine. Obviously, you should use the same encoding when reading the file.
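
To illustrate the point about reading with the same encoding, here is a rough round-trip sketch. The class name RoundTripDemo and the helper WriteThenRead are assumptions made for this example, and it uses a temporary file rather than the path above:

```csharp
using System;
using System.IO;
using System.Text;

public static class RoundTripDemo
{
    // Hypothetical helper: writes 'text' to a temp file with one encoding
    // and reads it back with another, then deletes the file.
    public static string WriteThenRead(string text, Encoding writeEncoding, Encoding readEncoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text, writeEncoding);
            return File.ReadAllText(path, readEncoding);
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        Encoding utf8 = new UTF8Encoding(false);
        string text = "as\u00E4\u00E4jd";

        // Same encoding on both sides: the text survives.
        Console.WriteLine(WriteThenRead(text, latin1, latin1) == text); // True
        // Written as ISO-8859-1 but read as UTF-8: the umlauts are lost.
        Console.WriteLine(WriteThenRead(text, latin1, utf8) == text);   // False
    }
}
```

The same reasoning applies to whatever consumes the file later: the reader has to be told, or has to assume, the encoding that was actually used to write it.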

Patrick McDonald
AxelEckenberger
  • MSDN says EMIT ... and I kept reading OMIT arghh!!! I tried Encoding.GetEncoding("ISO-8859-1") and it does not write the BOM, but I still have trouble with special chars :( – balexandre Mar 23 '10 at 19:46
  • @balexandre: I read Ømit. You are forgetting to make the HttpResponse.Charset property match the encoding of the file. Setting them both to UTF-8 is rather a good idea. – Hans Passant Mar 23 '10 at 20:22
  • @Thomas Levesque I downvoted by mistake ... (too many clicks today! and ... no confirmation message on downvotes) :-/ my mistake though! I quickly upvoted to +1 – balexandre Mar 23 '10 at 23:17
  • @nobugz HttpResponse has nothing to do with the file, the file is already written, HttpResponse is only the sending to the client part – balexandre Mar 23 '10 at 23:18
  • @balexandre: it does, TransmitFile makes it part of the response. All text in the response must have the same encoding. – Hans Passant Mar 24 '10 at 05:20