I'm working with a few other people to collaboratively edit some CSV files stored in a repository on GitHub. We have collaborators on Windows, OS X, and Linux, so to deal with differences in line endings and local Git settings, I added a .gitattributes file containing the following:

* text=auto

People edit these CSV files in Excel, then save and commit. Sometimes the committed file seems to have no newlines at all; every line is terminated by a bare carriage return (\r) instead. GitHub then sees the file as one big line and thinks the entire file has been deleted and replaced by one long line. Here's an example of such a commit:

https://github.com/weecology/neonetods/commit/7e10cb2913ca2e214c49944b4856519cab9bad96

If you were to check out the file, you'd see that each line ends with a \r. This has happened to two people now after simply editing and saving the file in Excel, on both Mac and Windows.
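
To confirm which terminators a saved file actually contains before committing, the bytes can be counted directly. This is only a rough sketch: the function name `count_line_endings` and the sample data are made up for illustration and are not part of the repository.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare CRs, and bare LFs in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs
        # to get the terminators that stand alone.
        "cr": data.count(b"\r") - crlf,
        "lf": data.count(b"\n") - crlf,
    }


# Hypothetical sample resembling a CSV saved with bare carriage returns.
sample = b"id,name\r1,oak\r2,elm\r"
print(count_line_endings(sample))  # {'crlf': 0, 'cr': 3, 'lf': 0}
```

For a real file, read it in binary mode (`open(path, "rb").read()`) so that Python does not translate the line endings before they are counted.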

This is causing conflicts where there shouldn't be conflicts and making it hard to track the provenance of each file. Does anyone have any idea how this could be happening or how we could resolve it?

heyitsbmo

1 Answer

The person who asked "git and CR vs LF (but NOT CRLF)" had a similar issue. The solution was to use a filter, which is inconvenient because everyone has to add the filter definition to their own .git/config, but it should solve the problem.

It took a long time to figure out the appropriate filter to use, but this:

clean = LC_CTYPE=C awk '{printf(\"%s\\n\", $0)}' | LC_CTYPE=C tr '\\r' '\\n'

...is what ended up working for everyone. It is meant to replace \r\n with \n first (so that \r\n doesn't become \n\n), then replace any remaining \r with \n. The LC_CTYPE=C prefixes deal with some odd character encoding issues that result when using tr on a Mac.
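
The two-step normalization described above (CRLF to LF first, then any lone CR to LF) can also be sketched in Python, which makes the ordering easy to check on a small input. This is not the awk/tr filter itself, just an equivalent sketch that assumes Python is available on every collaborator's machine; `normalize_newlines` and `run_filter` are hypothetical names.

```python
import io


def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and bare CR terminators to LF."""
    # Collapse CRLF first; doing the bare-CR step first would turn
    # each CRLF into two LFs and insert blank lines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def run_filter(src, dst) -> None:
    """Read all bytes from src, normalize them, and write them to dst."""
    dst.write(normalize_newlines(src.read()))


# Demonstration on an in-memory file that mixes all three styles.
src = io.BytesIO(b"id,name\r1,oak\r\n2,elm\n")
dst = io.BytesIO()
run_filter(src, dst)
result = dst.getvalue()
print(result)  # b'id,name\n1,oak\n2,elm\n'
```

To try this as a clean filter, a small script would call `run_filter(sys.stdin.buffer, sys.stdout.buffer)`, since Git pipes the file through the filter command's standard input and output. Registering it would look something like `git config filter.normeol.clean "python normalize_eol.py"` plus a `*.csv filter=normeol` line in .gitattributes, where `normeol` and `normalize_eol.py` are made-up names.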

heyitsbmo