
I came across this weird situation:

I need to save a dataframe to a .csv file encoded in UTF-8 and with LF line endings. I'm using the latest version of R and RStudio on a Windows 10 machine.

My first attempt was, naively:

write.csv(df, fileEncoding="UTF-8", eol="\n")

Checking with Notepad++, it appears the encoding is UTF-8; however, the line ending is CRLF, not LF. OK, let's double check with Notepad: surprise, surprise, according to Notepad the encoding is ANSI. At this point I'm confused.

After looking at the docs for the function write.csv I read that:

CSV files do not record an encoding

I'm not an expert on the topic, so I decided to fall back to saving the file as a plain .txt using write.table, as follows:

write.table(df, fileEncoding="UTF-8", eol="\n")

Again, the same result as above, no change whatsoever. I also tried the combinations

write.csv(df)
write.table(df)

without specifying encodings, but no change. Then I set the default text encoding in RStudio to UTF-8 and the line ending to LF (as in the screenshot below)

[Screenshot: RStudio global options with the default text encoding set to UTF-8 and LF line endings]

and ran the tests again. No change. What am I missing??


1 Answer


This is an odd one, at least for me. Nonetheless, I found the solution by reading the docs for write.table. Apparently, to save a file Unix-style on Windows you have to open a binary connection to the file and then write to it with the desired eol (a text-mode connection on Windows translates "\n" into "\r\n", while a binary connection writes the bytes as given):

f <- file("filename.csv", "wb")
write.csv(df, file=f, eol="\n")
close(f)

As far as the UTF-8 encoding is concerned, the global settings should work fine.
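
If pulling in an extra package is an option, readr may be a simpler route: as far as I know, readr::write_csv always writes UTF-8 and uses LF line endings by default, so the binary-connection trick isn't needed. A minimal sketch (the file name is just an example):

library(readr)

# write_csv() writes UTF-8 output and uses "\n" line endings by default,
# regardless of the platform.
write_csv(df, "filename.csv")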

You can check that the eol is LF using Notepad++. UTF-8 is harder to verify: on Linux, isutf8 (from moreutils) says the files are indeed UTF-8, but Windows' Notepad disagrees when saving and calls them ANSI (most likely because a file containing only ASCII characters is byte-for-byte identical in both encodings when there is no BOM).
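
If you prefer to check from within R instead of an external editor, one option (a sketch, assuming the file name used above) is to read the file back as raw bytes and look for CR characters:

# Read the raw bytes and test for CR (0x0D); FALSE means the file uses LF-only endings.
bytes <- readBin("filename.csv", what = "raw", n = file.size("filename.csv"))
any(bytes == as.raw(0x0D))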
