I have a CSV file on the server encoded as ANSI, and I want to open it, process it, and save the content to the database. I'm having endless problems with accented characters such as "à è ì ò ù", which come out as "?" instead.

The charset in the HTML header is set to UTF-8. This is my code:

```vbscript
Response.CharSet = "UTF-8"

...

Set objStream = CreateObject("ADODB.Stream")
objStream.Type = 2
objStream.CharSet = "utf-8"
objStream.Open
objStream.Position = 0
objStream.LoadFromFile( path )

strData = objStream.ReadText()
Response.write(strData) '<== just to see

objStream.Close
Set objStream = Nothing
```

At first I was using a plain FileSystemObject, but I read that it has even more problems handling encodings.

Fehu
  • Maybe [this](http://stackoverflow.com/a/25685355/2861476) could help you – MC ND Mar 24 '15 at 22:51
  • I've tried it. It saves a file on the server that is perfectly encoded in UTF-8, with all the characters readable, but if I read it with the above code I again get results such as "Corposit�". If I change the parameter to objStream.CharSet = "unicode", it gives me a long line of "????????????" – Fehu Mar 25 '15 at 12:45
  • Where do you get the wrong data, in the database or in the `Response` output? – MC ND Mar 25 '15 at 14:36
  • directly in the response – Fehu Mar 25 '15 at 15:11
  • Then you will need something more than the posted code (maybe you are already using it, but I don't know). See [here](http://blog.inspired.no/utf-8-with-asp-71/) – MC ND Mar 25 '15 at 15:16

1 Answer

Internally, VBScript strings are UTF-16 encoded. I/O functions that read must be told (or must assume by default) the correct source encoding in order to convert the input into UTF-16. I/O functions that write must be told (or must assume by default) the desired output encoding in order to convert UTF-16 into that encoding.
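As a rough illustration of both directions, the sketch below reads a file with one explicit charset and writes it back out with another, using two ADODB.Stream objects. The file names and the "windows-1252" source charset are assumptions made for the example; nothing in the question confirms which encoding the file really uses.

```vbscript
Const adTypeText = 2
Const adSaveCreateOverWrite = 2

' Hypothetical paths, for illustration only
srcPath = Server.MapPath("input.csv")
dstPath = Server.MapPath("output-utf8.csv")

' Read: tell the stream which encoding the bytes on disk use
Set inStream = CreateObject("ADODB.Stream")
inStream.Type = adTypeText
inStream.Charset = "windows-1252"   ' ASSUMED source encoding
inStream.Open
inStream.LoadFromFile srcPath
strData = inStream.ReadText()       ' strData is now a UTF-16 VBScript string
inStream.Close
Set inStream = Nothing

' Write: tell the stream which encoding to produce
Set outStream = CreateObject("ADODB.Stream")
outStream.Type = adTypeText
outStream.Charset = "utf-8"
outStream.Open
outStream.WriteText strData
outStream.SaveToFile dstPath, adSaveCreateOverWrite
outStream.Close
Set outStream = Nothing
```

If the source charset is wrong, the damage happens in the read step, and no choice of output charset can repair it afterwards.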

If your file is really (some kind of) ANSI then your

objStream.CharSet = "utf-8"

is wrong. It should be the name of the encoding (cpXXX, ISO_YYY, ZZZ) that your file really uses.
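A minimal sketch of that change, applied to the code from the question, might look like the following. It assumes the file is Windows-1252, which is a common "ANSI" code page on Western European Windows systems but is not confirmed for this file. The `Response.CodePage` line is also an addition that is not in the original code: it asks ASP to encode the response body as UTF-8 so that it matches the declared charset.

```vbscript
' Sketch only: "windows-1252" is an ASSUMPTION about the file's real encoding
Response.CodePage = 65001    ' encode the response body as UTF-8
Response.Charset = "utf-8"   ' declare UTF-8 in the Content-Type header

Set objStream = CreateObject("ADODB.Stream")
objStream.Type = 2                    ' adTypeText
objStream.Charset = "windows-1252"    ' the encoding the file really uses
objStream.Open
objStream.LoadFromFile path           ' path as in the question

strData = objStream.ReadText()
Response.Write strData                ' just to see

objStream.Close
Set objStream = Nothing
```

If the accented characters are still wrong, the file is probably in a different code page, and the charset name has to be adjusted to match.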

Did you test using the FileSystemObject? Maybe it will guess right and your problem is solved without extra effort.

Ekkehard.Horner
  • I tried FileSystemObject first, but it produces artifacts on the same characters. I don't have much control over the source file, but what I'm using now is a CSV generated with Excel 2010 from an Excel table. I only know that it's ANSI because Notepad++ says so; I did not even know that there are multiple ANSI standards :S – Fehu Mar 25 '15 at 10:01