0

I am retrieving daily logs from a system in the format of a .csv file. I want to import these to my database. The issue is that, for some reason, these .csv files seem to lack CRLF line endings (it's being replaced by LF). If I open the file in Notepad++, all the line endings are LF - this won't work because when uploading I split the file into the database by using CRLF to determine rows and LF to determine columns.

If I open the .csv file in Excel, and save it as a .csv file, the CRLF gets added and I can upload it fine to the database.

I can't understand how when I read the while with notepad++, all the endings are LF, however Excel can somehow differentiate which ones are really CRLF and which ones are LF?

The file is 4-8MB so it takes a while for Excel to open it (+-30sec?), so I was wondering if anyone knows what the problem is and if there's any way other way to solve it rather than opening it in Excel or a CSV editor and save it again. It's very unnecessary work for an otherwise automated process that I'd love to get rid of.

Thanks!

Correct format, after opening and saving with Excel

The csv file after downloading, without opening and saving again

sebjsv
  • 1
  • 2
  • 1
    Run `unix2dos` on the file? – Alan Birtles Sep 11 '20 at 07:08
  • @AlanBirtles Hi, thank you for your answer. I am using a work computer right now running Windows, and I cannot install any unauthorized software. Is there any way to do something similar on Windows? I could try it on my personal computer to see if it works though – sebjsv Sep 11 '20 at 07:10
  • I ran unix2dos, however it changed all the LF to CRLF making the file equally unusable in my situation... I really don't understand how Excel differentiates it – sebjsv Sep 11 '20 at 07:44
  • An example file might help understand the issue – Alan Birtles Sep 11 '20 at 08:25

1 Answers1

0

It's a bit hard to tell, but if the question is: "why can excel open these", the answer is likely simple: Because they designed it to.

There's lots of incorrect text/csv generated across all software, so perhaps the Excel engineers simply decided that it was worth parsing these, even if it's wrong. It's not hard to write software that can recognize both \n and \r\n, that said... if you are generating CSV, you should use \r\n for the software that does care about line endings.

Evert
  • 75,014
  • 17
  • 95
  • 156
  • Ah, that makes sense from Excel's perspective. Unfortunately I'm not the one generating these files so I have to work with what I've got. The weird part is that when getting the file it seems to all be LF LF LF (times thousands of lines), so if Excel can somehow differentiate it, should I not be able to in some other way? I updated my title to better reflect my issue. – sebjsv Sep 11 '20 at 07:12
  • Given csv isn't really standardised you can't say that `LF` is "wrong" it's at best unconventional – Alan Birtles Sep 11 '20 at 07:14
  • If your question is 'can I fix these files without excel', there's a lot of CLI tools for fixing line-endings that are likely only 1 google search away. – Evert Sep 11 '20 at 07:18