I am trying to create a CSV file from my Java code.

    File file = File.createTempFile("DummyReport", ".csv");
    SomeListofObjects activities = getSomeList();
    FileUtils.write(file, "ID;CREATION;" + System.lineSeparator());
    FileUtils.writeLines(file, activities.getItems(), true);        
    return file;

I am running into a problem with special characters.

When I debug the code, I can see that the string contains the character "ö", but in the generated CSV file it shows up garbled as "Ã¶".

Can this be set in FileUtils or File? Can someone help me solve this?

Duncan Jones
Patan

2 Answers


First check if you are using a text viewer that displays your output correctly. If not, the problem might be your system encoding.

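FileUtils.write(file, string) uses the platform default encoding. A mismatch between that encoding and the one your viewer assumes explains the symptom: in UTF-8 the "ö" character is encoded as two bytes, and a program that reads those bytes with an 8-bit encoding displays them as two characters, "Ã¶".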
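
The effect can be reproduced with the JDK alone. The following is a minimal sketch, not part of the original code: the class and method names are made up for illustration, and ISO 8859-1 stands in for whatever 8-bit encoding the viewer actually assumes.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encode the text as UTF-8, then decode the same bytes as ISO 8859-1,
    // which is what a viewer assuming the wrong encoding effectively does.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = misread("\u00f6"); // "\u00f6" is the character ö
        // One character in, two characters out: U+00C3 (Ã) and U+00B6 (¶).
        System.out.printf("U+%04X U+%04X%n",
                (int) garbled.charAt(0), (int) garbled.charAt(1));
        // prints "U+00C3 U+00B6"
    }
}
```

The code points are printed instead of the characters themselves so that the output does not depend on the console encoding.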

Use FileUtils.write(File file, CharSequence data, String encoding) instead, with an appropriate encoding:

  • ISO 8859-1 (8bit standard, Latin-1)
  • CP1252 (8bit proprietary, Windows default, extends Latin 1)
  • MacRoman (8bit proprietary, Apple default)
  • UTF-8 (variable-width standard, 1 to 4 bytes per character, Linux default)
  • ISO 8859-15 (Latin-9, not always supported)

My suggestion is to use FileUtils.write(file, string, "UTF-8").
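
To make the difference between the encodings listed above concrete, here is a small sketch that counts how many bytes "ö" occupies in a few charsets that every JVM is required to support. The class and method names are hypothetical; CP1252 and MacRoman are left out because their availability depends on the runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodedLengthDemo {
    // Number of bytes the given text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u00f6"; // the character ö
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE));   // 2
    }
}
```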

sina72
  • When I open this with Notepad++, I do get the correct chars. So I think the problem is the system encoding. Thanks for the suggestion. – Patan May 21 '14 at 12:59

You do not specify an encoding when you write to your file, so the default encoding is used.

It appears however that you use UTF-8, and unfortunately, you use Excel.

And Excel does not recognize a CSV file as UTF-8 unless the file starts with a BOM... which no other program requires.

So, you have two choices:

  • keep doing what you are doing and to hell with Excel;
  • prepend a BOM to the file and make the file unreadable with other programs!

Also, if you are using Java 7+, use Files.write() instead.
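
As a rough sketch of what that could look like for the report in the question: the class name, method name, and the assumption that the rows are already available as strings are all illustrative, not taken from the original code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

class Utf8CsvWriter {
    // Writes a header line and the given rows, all encoded explicitly as UTF-8.
    static Path writeReport(List<String> rows) throws IOException {
        Path file = Files.createTempFile("DummyReport", ".csv");
        // Default options create or truncate the file; each line gets a line separator.
        Files.write(file, Arrays.asList("ID;CREATION;"), StandardCharsets.UTF_8);
        // Append the data rows with the same explicit charset.
        Files.write(file, rows, StandardCharsets.UTF_8, StandardOpenOption.APPEND);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeReport(Arrays.asList("1;\u00f6", "2;\u00e4"));
        System.out.println(file);
    }
}
```

Because the charset is passed on every call, the result no longer depends on the platform default encoding.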

Another solution would of course be to use an ISO 8859 encoding such as ISO 8859-1, but... Well, that's your choice.

fge
  • Excel can read UTF-8, see http://stackoverflow.com/questions/6002256/is-it-possible-to-force-excel-recognize-utf-8-csv-files-automatically – sina72 May 21 '14 at 13:45
  • @sina72 not if you don't use a BOM – fge May 21 '14 at 14:12
  • @fge. Can you show me how to prepend the BOM here to fix this? – Patan May 21 '14 at 14:17
  • @Patan before writing your lines you should write char '\ufeff'; but don't forget to set your character encoding to UTF-8. – fge May 21 '14 at 14:28
  • Thanks. As you said, there would be problems with other readers. I am thinking of using an ISO encoding. Do you think that is better? – Patan May 21 '14 at 14:41
  • @Patan that's a tough call; really, in 2014, you should be using UTF-8 all around, but MS Office basically prevents progress from taking place here; it really depends on your use case! – fge May 21 '14 at 15:18
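
The BOM suggestion from the comments above (write '\ufeff' first, with UTF-8 as the encoding) could be sketched as follows. The class and method names are hypothetical, and java.nio.file.Files is used here in place of FileUtils so that the example needs only the JDK.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Utf8BomCsvWriter {
    // Writes the report as UTF-8 with a leading byte order mark (U+FEFF).
    static Path writeReportWithBom(List<String> rows) throws IOException {
        Path file = Files.createTempFile("DummyReport", ".csv");
        List<String> lines = new ArrayList<>();
        // U+FEFF encodes to the three bytes EF BB BF in UTF-8.
        lines.add("\uFEFF" + "ID;CREATION;");
        lines.addAll(rows);
        Files.write(file, lines, StandardCharsets.UTF_8);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeReportWithBom(Arrays.asList("1;\u00f6"));
        System.out.println(file);
    }
}
```

Readers that do not expect a BOM will see U+FEFF as part of the first header field, which is the trade-off discussed in the answer.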