
I'm looking to fix an issue where I now have two directory paths with the same name that differ only in capitalization, because I renamed the directory to lowercase at some point.

I see only one folder/directory listed in the folder view, and there are no other files or folders in the root: "someproject"/"src".

However, when making changes I see 2 sets of files to commit.

Src/somefile.txt
src/somefile.txt

How do I remove the Src folder and its contents from the repo?

  • Would I back up "src", remove the "src" folder, commit, then add it back in, commit, and push?
Jason Foglia

2 Answers


To remove the directory from Git, but not from the filesystem:

git rm -r --cached folderNameToRemove

Add the directory to .gitignore, and remember to do a git push after that.

For similar scenarios, check out this brilliant thread

nobalG
  • Actually, only the correct directory exists on disk, not both. When I changed a single file, git status showed the change in two locations, both the wrong one and the right one. When cloned, however, the repository would create two directories on disk, which is what I was trying to avoid. Thank you again. – Jason Foglia Jan 11 '21 at 16:28

nobalG's answer is good

If you run:

git rm -r --cached src
git add src

you should be good to go (confirm with git status). You don't need to add anything to .gitignore. You may also not actually need to git add anything.

The problem with this answer is that it's just a recipe; it tells you nothing about what to do the next time you have various problems.

Long

First, let's define the problem properly. We need to note the following:

  • Git stores commits, not files. That is, the unit of storage you deal with, when you run git checkout or git commit, is the commit. It's true that commits contain files, but this is a package deal. (There are some somewhat-klunky ways to work piecemeal, which we'll get to in a bit; they're useful for special cases.)

  • All parts of any commit are read-only. That includes all the stored files. They are stored in a frozen-for-all-time, compressed format. The contents of each file are also de-duplicated against every other stored file. This is important because every commit stores a full copy of every file, but with the de-duplication, that means that most commits mostly share almost all of their files with every other commit. The fact that the files are read-only enables this: it's literally impossible to change a stored file, so if file F1 of commit X needs the same content as file F2 of commit Y, it's perfectly reasonable for them to actually use the same stored content.

  • The files you see and work on/with are not in Git. The files that are in Git are not the ones that you see and work on/with. This is a direct consequence of the previous item. The stored files, inside any given commit, are in a form that nothing but Git can read, and literally nothing—not even Git itself—can overwrite them, either partly or completely. This means that the files that are in Git are entirely useless for getting any actual work done. So Git doesn't try to do that. When you pick out a commit to work on/with, Git copies the files out of the commit into a work area, which we call your working tree or work-tree.
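The de-duplication described in the list above can be observed directly. This is a minimal sketch, not something from the original answer: the file names F1 and F2 echo the example in the text, while the temporary directory and the file content are made up for illustration. git hash-object computes the ID Git would use for a file's content, and identical content always yields the same ID, which is what lets two paths share one stored object.

```shell
# Scratch directory; no repository is needed just to compute IDs.
tmp=$(mktemp -d)
printf 'same text\n' > "$tmp/F1"
printf 'same text\n' > "$tmp/F2"

# The ID depends only on the content, not on the file's name.
id1=$(git hash-object "$tmp/F1")
id2=$(git hash-object "$tmp/F2")
echo "F1: $id1"
echo "F2: $id2"
```

Both lines should show the same hash ID, so a commit containing both files would store the content once.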

Now that we understand that the files you can see and work with aren't in Git, and that the files that are in Git are in a special Git-only form, we can understand what's going on here. When Git is storing files, these files are not in folders and can have almost-arbitrary names. In a Git commit, we don't have a folder named src or Src storing a file named somefile.txt. Instead, we just have a file named src/somefile.txt, complete with a slash in it.[1] And (this is the key to our problem here) these names are always case sensitive, so that a commit can easily contain both src/somefile.txt and Src/somefile.txt, as two separate files. A commit can also contain both readme.md and README.md, and so on.

Again, these are not ordinary files. They're stored in a special, read-only, Git-only, compressed and de-duplicated format, that only Git can read (and nothing can write at all, except for one initial creation to make a new commit). And, they are case-sensitive, and can spell out file names that you literally can't use on a Windows machine.[2]
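To see that a commit really can hold two names differing only in case, here is a small sketch that avoids the work-tree entirely. It is an illustration under assumptions, not the answer's own procedure: the path names come from the question, while the throwaway repository, the blob content, and the committer identity are invented for the example. It relies on git update-index --cacheinfo, which writes an entry straight into the index.

```shell
# Throwaway repository; nothing here touches a real project.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Store one blob, then register it in the index under two names that
# differ only in case. No work-tree files are involved.
blob=$(printf 'hello\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob",Src/somefile.txt
git update-index --add --cacheinfo 100644,"$blob",src/somefile.txt

# Commit the index as-is, then list the names stored in the commit.
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'two spellings of one path'
names=$(git ls-tree -r --name-only HEAD)
echo "$names"
```

The listing should contain both spellings, even though a case-insensitive file system could never hold both at once.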

If you're on a typical Mac or Windows system, your regular files are case-preserving but case-insensitive. This is actually file-system-specific, and it's easy to set up a macOS mountable disk in which files are case-sensitive, so that you can store both readme.md and README.md. See my answer here for a way to do that on macOS. On Windows, you could set up a VM and run Linux; I don't use Windows so I have no canned procedure for this.


[1] Technically, the committed versions of files use a folder-like structure internally; the place that flattens away the folders is Git's index. But you can't work directly with the commits anyway: you read existing commits into Git's index, and build up new commits in Git's index, and here the file names literally include the slashes. So we might as well just say that file names have slashes in them.

[2] Mac users can laugh at Windows users who can't create files named aux.txt, because it's easy to create an aux.txt on a Mac. But Mac users have other problems: for instance, there are two ways to spell the file name agréable and only one of them works on the Mac. While that's sometimes a good thing (we might not want to have two different files that seem to have the same name), it comes with exactly the same kind of interoperability issues.


Setting up a problem case on macOS

Since I have a Mac handy here, I can illustrate how we set up a problem case. We start with a new, totally-empty repository:

sh-3.2$ cd ~/tmp && mkdir case && cd case && git init
Initialized empty Git repository in /Users/torek/tmp/case/.git/
sh-3.2$ mkdir Dir && touch Dir/file && git add Dir/file && git commit -m initial
[master (root-commit) 07ed87d] initial
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 Dir/file
sh-3.2$ mv Dir dir
sh-3.2$ echo some contents > dir/file

Now, so far all we've done in Git itself is create an initial commit that contains an empty file named Dir/file. The last two commands, however, told macOS to modify our working area (our work-tree): rename Dir, initial-uppercase, to dir, lowercase, and put some contents into the given file. Running ls shows that the OS did in fact rename the work-tree directory, but Git knows that on this file system, dir/file and Dir/file both work to read or write the same ordinary file. So:

sh-3.2$ ls
dir
sh-3.2$ cat dir/file
some contents
sh-3.2$ cat Dir/file
some contents

You can see the updates we made. But:

sh-3.2$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   Dir/file

no changes added to commit (use "git add" and/or "git commit -a")

Git knows that both dir/file and Dir/file work to access the same file. Moreover, Git knows that our last commit had an empty Dir/file in it. Git assumes, because of this, that we'd like to keep the previous spelling of the file's name: Dir/file. (Note that the name is complete with the slash in it: there are no folders, just files.)

Diving deeper: Git's index and core.ignorecase

There are multiple intertwining pieces involved here:

  • First, when we check out some commit, Git builds up a data structure that Git calls, variously, the index, or the staging area, or—rarely these days—the cache. These three names reflect how important this thing is, or the fact that "the index" is not a very good name, or perhaps both.

    I like to summarize Git's index as your proposed next commit. That's its role as the staging area. But it's not entirely complete, as we'll see in a moment. Still, the key idea here is that when you first check out some particular commit, Git copies all the committed files to its index (the note after this list describes what the index actually holds). The index, your proposed next commit, now matches the current commit. If and when you make changes, you have to get Git to copy the updated file(s) into the index, and that's where git add comes in.

    Remember, the index has the file's full name, complete with forward slashes. This is all kept in data files—currently .git/index and perhaps some additional files in the .git directory—that are not being managed by your OS, but rather by Git. So they are up to Git, and Git says that these file names are case sensitive, no matter what else is going on. The index can therefore store both readme.md and README.md, or both dir/file and Dir/file.

  • Meanwhile, your work-tree or working tree has in it all your files, expanded out to usable form. But if your OS treats these as case-insensitive, you won't be able to store both readme.md and README.md here. You won't be able to store both dir/file and Dir/file. Your OS treats dir and Dir as a single folder name, and both readme.md and README.md as a single file name. So the fact that Git's commits can hold two names here, but your OS's file system can't, causes our problems.

  • Last, when Git sets up the Git repository—at git init time in our example above, for instance—Git probes your file system to see whether it ignores case. If your OS does ignore case, so that this problem can occur, Git sets core.ignorecase to true:

    sh-3.2$ git config --get core.ignorecase
    true
    

    Git uses this to "know" that Dir/file and dir/file, while being different to Git, are the same to your host OS's file system.
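The kind of probe described in the last item can be approximated by hand. What follows is a rough sketch of the idea only, not Git's actual code: the file name casetest and the variable fs_ignores_case are made up for illustration. It creates a file under one spelling and asks whether the other spelling finds it.

```shell
# Probe a scratch directory. Note that mktemp may pick a different
# file system from the one holding your repository; run the probe
# inside the repository's directory to test that file system instead.
probe=$(mktemp -d)
touch "$probe/casetest"

# If the uppercase spelling resolves to the file we just created,
# the file system ignores case.
if [ -e "$probe/CASETEST" ]; then
    fs_ignores_case=true
else
    fs_ignores_case=false
fi
echo "file system ignores case: $fs_ignores_case"
rm -r "$probe"
```

On a typical Mac or Windows setup this should report true, matching the core.ignorecase value shown above; on a typical Linux file system it should report false.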


[3] What's actually inside the index is:

  • the file's name, complete with forward slashes;
  • the file's mode, normally either 100644 (read/write) or 100755 (read/write but also executable);
  • a blob hash ID, which acts like a link to the file's content, already de-duplicated and in the internal Git format; and
  • flags and other internal cache data to help Git keep track of what's in your work-tree.

Since the content is actually elsewhere, the index doesn't hold a true copy of the file. However, the copy management is all automatic. This means that the index copy acts like a copy of the file, without taking any disk space if possible (if the content duplicates some existing committed file).

The end result of all of this is that the index "copy" of each file is ready to go into the next commit. In other words, what's in the index is what's staged. But if you have ten thousand files, listing them all would be painful and boring when 9997 of them are the same as what's already committed in the current commit. So git status doesn't tell you about the 9997 identical files. It just says that the three that don't match are "staged for commit". In fact, all ten thousand are staged; we just keep git status usable by not saying anything about the matching ones.
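The index entries described in this note can be inspected with git ls-files --stage, which prints the mode, the blob hash ID, a stage number, and the full path for every entry. A minimal sketch, assuming a throwaway repository (the directory is made up; the file name and content mirror the example used elsewhere in this answer):

```shell
# Throwaway repository with a single staged file.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
mkdir dir
echo 'some contents' > dir/file
git add dir/file

# One line per index entry: mode, blob hash ID, stage number, path.
entries=$(git ls-files --stage)
echo "$entries"
```

Unlike git status, this lists every entry, including the ones that match the current commit.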


How Git uses core.ignorecase

So, in our setup above, we:

  1. Created a new empty repository, on a Mac or Windows case-insensitive system, where we can't have both README.md and readme.md files, and we can't have both Dir/file and dir/file. If we have a directory named Dir (uppercase) and we try to create dir/another, the system will create Dir/another instead.

    This is not Git's problem, but Git does have to deal with it.

  2. Created and committed an empty Dir/file. This is now inside a commit. It can never be changed! That commit, as long as it exists, has that path name in it, with that particular casing.

  3. Used some OS-side tools (plain mv on macOS, I'm not sure what you would use on Windows) to rename Dir to dir.

Because of the case-insensitive nature of the host file system, Git itself set core.ignorecase to true when we created the repository. We can now lie to Git and claim that our file system is case-sensitive, so that dir/file and Dir/file would be two different files. This isn't true! But we can actually lie to Git, on purpose, temporarily. If things go wrong, we'll be responsible: don't do this unless you're willing to take the responsibility.

sh-3.2$ echo some contents > dir/file
sh-3.2$ git status --short
 M Dir/file
sh-3.2$ git config core.ignorecase false
sh-3.2$ git status --short -uall
 M Dir/file
?? dir/file

(The first line is a repeat, but because I used >, I can repeat it as much as I like: it erases the file and puts one line reading some contents into it.)

Here, I've used git status --short to shorten the output. The M indicates that the work-tree copy has been modified. Before I lie to Git, Git checks and sees that dir/file exists, figures out that this matches the index Dir/file, and matches them up. Once I lie to Git, though, something interesting happens:

  • Git believes there's a Dir/file. It tries opening that file, assuming that it won't get dir/file. But it does get dir/file. It compares this dir/file to the one it thinks it got—Dir/file—and sees that it's different. So now Git says that the file is modified (state M in the work-tree).

  • Git spots the directory / folder named dir, reads it, and finds a file named file. The path dir/file is not in Git's index, so Git says that this file is untracked (that's the double question marks).

I can now git add dir/file:

sh-3.2$ git add dir/file
sh-3.2$ git status --short
 M Dir/file
A  dir/file

Git copied dir/file into its index, using the lowercase name. This copy of the file has the contents some contents. The git status output now shows it as a new file. That's because the current commit does not have a dir/file; it has only a Dir/file (with different contents).

I can now make a new commit. This new commit contains both Dir/file—which is still empty—and dir/file, with contents some contents:

sh-3.2$ git commit -m 'add lowercase version with content'
[master 793d32c] add lowercase version with content
 1 file changed, 1 insertion(+)
 create mode 100644 dir/file
sh-3.2$ git show HEAD:Dir/file
sh-3.2$ git show HEAD:dir/file
some contents
sh-3.2$ git ls-tree -r HEAD
100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391    Dir/file
100644 blob f70d6b139823ab30278db23bb547c61e0d4444fb    dir/file

The blob lines here are Git's way of saying that there are two files. The modes of the two files are 100644 (read/write but not execute) and the big ugly hash IDs are how Git de-duplicates the contents; the file's names appear on the right. The -r option to git ls-tree is required to get it to show what's in each "folder" (see also footnote 1: if it weren't for the fact that the index flattens away folders, Git would be able to store an empty folder, but because the index does its thing, Git can't).

Now that I have made this commit, this frozen state—with two different name-cases—exists forever, or at least, as long as this second commit continues to exist. I was able to do this, even on a case-insensitive Mac file system, through the trick of lying to Git.

I can now do one last trick before I restore core.ignorecase:

sh-3.2$ git rm --cached Dir/file
rm 'Dir/file'
sh-3.2$ git status --short
D  Dir/file
sh-3.2$ ls
dir
sh-3.2$ ls dir
file

I actually could have restored core.ignorecase already, as git rm --cached doesn't need to do fancy case tricks: it's always removing entries directly from Git's index, and those entries are always case sensitive anyway. But I thought I'd show it here like this. Let's put core.ignorecase back the way it should be, and commit the result:

sh-3.2$ git config core.ignorecase true
sh-3.2$ git commit -m 'remove uppercase Dir/file'
[master c5edc17] remove uppercase Dir/file
 1 file changed, 0 insertions(+), 0 deletions(-)
 delete mode 100644 Dir/file
sh-3.2$ git status
On branch master
nothing to commit, working tree clean

I've made a corrected commit, with the contents I want, and now I have a commit that can be used correctly on this Mac, even in a case-insensitive file system. The middle commit, that has both Dir/file and dir/file in it, can't be used correctly on a typical Windows or Mac setup—but because Git actually works with the index when making new commits, we can, with careful trickery, work with such a repository. It's not fun, and better tooling would probably be good, but this shows how you can get past some problems.
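The same cleanup can be sketched for the names in the original question. This is a hedged illustration rather than a tested procedure for any particular project: the throwaway repository, the blob content, and the committer identity are invented, and the colliding entries are created with git update-index so that the sketch does not depend on the file system's case rules. The sort -f | uniq -di pipeline is one possible way to list index paths that are equal when case is ignored.

```shell
# Throwaway repository whose first commit holds both spellings.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
blob=$(printf 'hello\n' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,"$blob",Src/somefile.txt
git update-index --add --cacheinfo 100644,"$blob",src/somefile.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'both spellings'

# Paths that collide when case is ignored (one line per group).
before=$(git ls-files | sort -f | uniq -di)

# Remove only the index entries under the uppercase spelling, then
# commit. --cached leaves the work-tree alone.
git rm -r -q --cached Src
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'remove uppercase Src'

after=$(git ls-files | sort -f | uniq -di)
remaining=$(git ls-tree -r --name-only HEAD)
echo "colliding before: $before"
echo "colliding after:  $after"
echo "tracked now:      $remaining"
```

After the second commit, only the lowercase path should remain tracked, while the earlier commit still contains both names.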

Note that while you have core.ignorecase turned off, you are responsible for getting the file name case right. Git will think—incorrectly—that creating dir/file will create dir/file, even if a folder named Dir/ exists. You'll have to manually rm -r the entire Dir directory, or rename it out of the way:

mv Dir save
git checkout -- dir/file

for instance—so that when Git goes to create dir, the OS doesn't just blithely use the existing Dir/ directory instead.

It's definitely a good idea to turn core.ignorecase back on (assuming it was on initially: use git config --get to find out) once you're done with this sort of manual, careful, one-file-at-a-time digging-about.
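One way to make sure the setting goes back to what it was is to record it first. The sketch below is an assumption-laden illustration: it uses a throwaway repository, and the word unset is just a placeholder chosen here for the case where core.ignorecase is absent from the configuration (git config --get exits non-zero then).

```shell
# Throwaway repository for demonstration.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Remember the current setting; "unset" is a placeholder chosen here
# for the case where the key is absent from the configuration.
saved=$(git config --get core.ignorecase || echo unset)

git config core.ignorecase false
# ... careful, one-file-at-a-time index work would go here ...

# Put the setting back exactly as it was.
if [ "$saved" = unset ]; then
    git config --unset core.ignorecase
else
    git config core.ignorecase "$saved"
fi
restored=$(git config --get core.ignorecase || echo unset)
echo "before: $saved, after: $restored"
```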

Note that:

git checkout <commit> -- <path>

tells Git to extract the given <path> from some specific commit by:

  1. copying that into Git's index: with core.ignorecase off, Git will think it's fine to have some similar name with a different case;
  2. copying the resulting index file out to your work-tree; this is where you have the responsibility for making sure that any existing folders have the right case.

This is what I meant above by "piecemeal", way at the top of the long section. The form:

git checkout -- <path>

tells Git to extract the given path from the current contents in Git's index.

The fact that we use a commit specifier in one case, and omit it in the other, is confusing. If you have Git 2.23 or later, you can use git restore instead of git checkout. The restore command lets you specify the source of some content as either a commit or the index. If you pick a commit, you can specify whether the file is to be copied to the index, or your work-tree, or both. If you pick the index as a source, the only place you can copy the file to is your work-tree.

(The last follows from the fact that both your work-tree and Git's index can be written-to, but a commit can only be read-from.)
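The two forms can be compared side by side with git restore. This is a minimal sketch assuming Git 2.23 or later and a throwaway repository; the file name, its contents, and the committer identity are made up for illustration.

```shell
# Throwaway repository with one committed file.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
echo 'committed' > file
git add file
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m initial

# Give the index and the work-tree two different newer versions.
echo 'staged' > file
git add file
echo 'work-tree only' > file

# Source = index, destination = work-tree
# (the counterpart of: git checkout -- file).
git restore -- file
from_index=$(cat file)

# Source = a commit, destinations = index and work-tree
# (the counterpart of: git checkout HEAD -- file).
git restore --source=HEAD --staged --worktree -- file
from_commit=$(cat file)

echo "from index:  $from_index"
echo "from commit: $from_commit"
```

The first restore should bring back the staged version, and the second should bring back the committed version in both the index and the work-tree.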

torek
  • Thank you for providing this information. I only needed the first command you mentioned and @nobalG mentioned. – Jason Foglia Jan 11 '21 at 15:42
  • @JasonFoglia: I mostly just wanted to get this down to use as a duplicate question and answer, whenever someone asks about file name case issues. – torek Jan 11 '21 at 15:57