git checkout without deleting untracked files

Question

repro

mkdir gittest && cd gittest
git init
touch file1 file2
git add ./file1
git add ./file2
git commit -a -m "master init"
git update-index --skip-worktree ./file2
git rm --cached ./file2
git commit -a -m "file2 removed from master"
git checkout -b branch2
git add ./file2
git commit -a -m "file2 added to branch2 instead."
git checkout master
ls file2

--

ls: cannot access 'file2': No such file or directory

Why the bloody git is deleting file2 when master is checked out when I already told it not to track file2 in master. Is it a bug?

It is tracked by `branch2`, so since this file does not exist in the other branch, it must be deleted. — Marek R, Nov 16 '18 at 13:03
Note that if `branch2` were not tracking this file, then switching with checkout would not delete it. In the scenario where a file that is untracked on the current branch exists in the branch you are switching to, Git will report a proper error. — Marek R, Nov 16 '18 at 13:06
@MarekR as you can see branch2 is newly created (as is the whole git repo) — ikkentim, Nov 16 '18 at 13:07
so what? He has committed this file to that branch `git add ./file2` `git commit -a -m "file2 added to branch2 instead."` — Marek R, Nov 16 '18 at 13:08
@MarekR It's not my Q, I see where it's going wrong ^_^ EDIT: As I see,my bad, I read your comment as "is it" but it says "it is". Sorry! — ikkentim, Nov 16 '18 at 13:10
But it's not happening with all files. I'll post a paste of it in a min — Necktwi, Nov 16 '18 at 13:11
Give an example where "it's not happening"? That might help clear up your confusion — ikkentim, Nov 16 '18 at 13:12
https://paste.pound-python.org/show/2uIUnKihlnOI9Lkqa1hS/ notice the `.emacs.d` directory. I've added it in `RPi3B` branch but it's not deleted when `master` is checked out unlike the `.irssi` directory! — Necktwi, Nov 16 '18 at 13:19

score 0 · Answer 1 · answered Nov 16 '18 at 13:12

You've removed file2 on master, and added it in branch2. Since it's tracked in branch2 and is not present on master, the file is deleted. This is normal behaviour.

ikkentim

http://paste.pound-python.org/show/2uIUnKihlnOI9Lkqa1hS notice the `.emacs.d` directory. I've added it in `RPi3B` branch but it's not deleted when `master` is checked out unlike the `.irssi` directory! – Necktwi Nov 16 '18 at 13:38
Then it's either not in the index, or it is in both branches, or it is in the .gitignore. – ikkentim Nov 16 '18 at 13:45
No, it's only in the `RPi3B` branch. Is it OK if I post you the `git ls-files` with the lengthy list of `.emacs.d` files? – Necktwi Nov 16 '18 at 13:48
(RPi3B):~ Necktwi$ git branch * RPi3B master (RPi3B):~ Necktwi$ git ls-files | wgetpaste Your paste can be seen here: https://paste.pound-python.org/show/k7PMlWYJKbMS08tgyPFi/ – Necktwi Nov 16 '18 at 13:51
Is it in .gitignore? – ikkentim Nov 16 '18 at 13:54
(RPi3B):~ Necktwi$ wgetpaste .gitignore Your paste can be seen here: https://paste.pound-python.org/show/rjhLA0Di0i7tsajS9Kb5/ – Necktwi Nov 16 '18 at 14:07
(RPi3B):~ Necktwi$ git branch RPi3B * master (RPi3B):~ Necktwi$ git ls-files | wgetpaste Your paste can be seen here: https://paste.pound-python.org/show/bc0scAxotE5xmaU3XmHF/ – Necktwi Nov 16 '18 at 14:09
@neckTwi there you go, as you can see there, `.emacs.d/ac-comphist.dat.crn` is on `master`. Therefore, `.emacs.d` will exist. – ikkentim Nov 16 '18 at 15:44

score 0 · Answer 2 · answered Nov 16 '18 at 20:37

Why the bloody git is deleting file2 when master is checked out when I already told it not to track file2 in master. Is it a bug?

It's acting according to design. You might call the design a bug, but the design is not what I think you think it is, as revealed by your own phrase "not to track file2 in master". The state of "being tracked", for any given file, is not a function of a branch name like master or branch2. There is, for each file, only tracked or untracked. A file is tracked if and only if it is in the index right now.

(Because of the way git checkout works, it is possible to extend this phrase to refer to specific commits, and branch names always refer to some specific commit. So it's not unreasonable to want to talk about a file being tracked in some branch, but if you do, you'll tend to mislead yourself.)
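The "tracked means in the index" rule can be checked directly with `git ls-files`, which lists index entries. The following is a minimal sketch in a throwaway repository; the temporary directory and the file names `tracked.txt` and `untracked.txt` are placeholders, not part of the question.

```shell
# Throwaway repository; path and file names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
touch tracked.txt untracked.txt
git add tracked.txt

# Files in the index (tracked):
tracked=$(git ls-files)
# Files in the work-tree but not in the index (untracked):
untracked=$(git ls-files --others)
echo "tracked:   $tracked"
echo "untracked: $untracked"

# Taking the file out of the index makes it untracked again,
# even though the work-tree copy is left in place.
git rm -q --cached tracked.txt
tracked_after=$(git ls-files)
echo "tracked after rm --cached: $tracked_after"
```

No branch name appears anywhere in this check: the answer depends only on what the index holds at that moment.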

What the index is and does

To really nail this all down, we have to first talk about the index: what exactly is the index? To get a proper mental picture, start with the notion that all commits are read-only. Each commit contains every file that you had Git store with that commit. Those files are frozen (read-only), and stored in a special, compressed, Git-only form.

While storing the files forever is all well and good, we need some way to use and work on the files. For this, Git provides the work-tree. Files in the work-tree are read/write (well, generally; you can make specific files read-only if you like). They have their normal everyday computer-use form, whatever that is on your computer. You can work with them to do whatever you need. But Git itself pays relatively little attention to them.

Most version control systems stop here: they have the frozen files in commits, and the work-tree files that you can work on. Git, however, inserts this special index thing in between the commit and the work-tree. As for why, you'd really have to ask Linus Torvalds, but we can observe that this does a bunch of things, including making git commit and git checkout really fast compared to those other version control systems. But it also gives us this big headache you have just run into: it provides the notion of a tracked file.

What's in the index is, normally and initially, just an un-frozen—but still compressed and Git-only format—version of the file that came out of the commit. This means that the file is perfect for Git to freeze into a new commit. So a new commit doesn't require re-compressing the file at all: it's already there, in the index. Hence, for most files, there are three active copies:

     commit             index             work-tree
----------------  ------------------  ------------------
frozen, Git-only  unfrozen, Git-only  unfrozen, ordinary

You can copy any copy to any other copy, except into the commit of course, because that copy is frozen. Copying from a commit to the index is straightforward but mostly invisible: git checkout does that, but then also copies from the index (after writing to the index) to the work-tree, so what you see is "copy file from some commit to the work-tree", without realizing there's a "to index" step in the middle.

Copying from the work-tree to the index is also straightforward: that's what git add does. The git add step compresses and Git-ifies the file during the git add, so that once it's in the index, it's ready to be frozen.

Perhaps the biggest thing the index does is that the index is always the source for a new commit.¹ When you run git commit, Git simply freezes the files that are in the index, without looking at the work-tree at all, and uses those to make the new commit. The new commit then becomes the current commit, so that the index copy and the commit copy match, in the same way that the index and commit copies of each file match right after your first git checkout of some commit.

This, then, gives us the one-line summary of what the index is: it's the set of files that you propose to put into your next commit. If you remove a file from the index, using git rm, you're proposing that your next commit won't include that file.
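To see that a commit is built from the index and not from the work-tree, here is a small sketch. It assumes a throwaway repository; the file name `note.txt` and the `demo` identity are placeholders introduced only for illustration.

```shell
# Throwaway repository; identity and file name are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com

echo "staged version" > note.txt
git add note.txt                      # index copy now holds "staged version"
echo "later edit" > note.txt          # work-tree copy changes; index copy does not
git commit -q -m "commit from index"

committed=$(git show HEAD:note.txt)   # what the commit froze
worktree=$(cat note.txt)              # what is on disk
echo "committed: $committed"
echo "work-tree: $worktree"
```

The commit holds the content that was staged with `git add`, while the later work-tree edit stays uncommitted.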

¹Commands like git commit -a, that seem to commit from the work-tree, really work by adding the files to the index first, then committing. When required, they make a special temporary index, add the files to the temporary index, and commit from the temporary index. This makes it look like Git is somehow committing from the work-tree, but it's not: it's committing from an index somewhere, even if it's a special temporary one.

`git checkout branch-or-commit` fills the index and the work-tree

Whenever you git checkout a commit, Git has to extract that commit's files into both the index and the work-tree. It needs the files to go into the index so that the index will match the commit. It needs the files to go into the work-tree so that you can see them. Once these are all in place, git checkout will update HEAD—which is where Git keeps track of the current commit and/or current branch—as appropriate, so that the current commit is the commit you just checked out, and you're on the branch or in "detached HEAD" mode as appropriate.

But note what just happened:

- git checkout filled the index from the commit.
- The contents of the index determine which files are tracked.

This means the set of tracked files changes. If you are on master and file2 is not in the index, then either file2 does not exist at all (so there's no question about it) or it exists in the work-tree and is therefore untracked. But as soon as you git checkout branch2, the commit at the tip of branch2 does have file2 in it, so file2 goes into the index and Git overwrites the work-tree copy. Now the file is tracked. If you then git checkout master, Git sees that file2 is currently tracked, but isn't in the commit you want to get to, so Git removes file2 from both the index and the work-tree.
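The round trip described above can be replayed while watching the index at each step. This is a sketch along the lines of the question's repro, without the `--skip-worktree` step; the temporary directory and the `demo` identity are placeholders.

```shell
# Throwaway repository; identity is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master regardless of local defaults
git config user.name demo
git config user.email demo@example.com

touch file1 file2
git add file1 file2
git commit -q -m "master init"
git rm -q --cached file2
git commit -q -m "file2 removed from master"
on_master_before=$(git ls-files)          # index on master: file2 is untracked here

git checkout -q -b branch2
git add file2
git commit -q -m "file2 added to branch2"
on_branch2=$(git ls-files)                # index on branch2: file2 is tracked

git checkout -q master
on_master_after=$(git ls-files)           # index refilled from master's tip commit
if [ -e file2 ]; then file2_state=present; else file2_state=gone; fi
echo "file2 after returning to master: $file2_state"
```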

This is the terrible danger of removing a file with git rm --cached: it leaves a copy in the work-tree, while taking it out of the index right now. But if it's in the index right now, there is a good chance it is also in some other commit(s). If you ever check those commits out, the file goes back into the index; if you then move away from that commit, the file gets removed from both the index and the work-tree, and now it's gone.
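One possible way to see this coming, which is not part of the original answer, is to ask Git which paths exist in the current commit but not in the commit you are about to check out. The sketch below assumes a throwaway repository with placeholder names, and uses `git diff --name-only --diff-filter=D` to list those paths before switching.

```shell
# Throwaway repository; identity, file content, and backup location are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name demo
git config user.email demo@example.com

touch file1
git add file1
git commit -q -m "master init"
git checkout -q -b branch2
echo "precious" > file2
git add file2
git commit -q -m "file2 added to branch2"

# Paths present in HEAD (branch2) but absent from master:
would_remove=$(git diff --name-only --diff-filter=D HEAD master)
echo "checkout master would remove: $would_remove"

# Keep a copy outside the repository before switching.
backup=$(mktemp -d)
cp file2 "$backup/file2"
git checkout -q master
```

After the switch, `file2` is gone from the work-tree, but the copy under `$backup` survives and can be put back as an untracked file.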

`git update-index --skip-worktree` is no help

What this does is set one of the two special control bits—the other one is --assume-unchanged—on an index entry for some file. The index entry only exists if the file is actually in the index: removing the file from the index removes the entry, and therefore removes the control bits.

When the control bits are set, various Git commands that would compare the index copy of the file to the work-tree copy of the file will skip their comparison. This means that Git won't suggest that you use git add to copy a work-tree update back into the index, and won't suggest that a file that is in the index, but for some reason is missing from the work-tree, is missing.

None of this affects what's actually in the index. The index copy of the file remains unchanged, and every new commit you make continues to snapshot the index copy of the file. It's just that git add -A doesn't update the index copy (if you've changed the work-tree copy), and git status doesn't complain that the index copy is stale (if you've changed the work-tree copy).
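A short sketch of what the bit does and does not do, assuming a throwaway repository with a placeholder `demo` identity. `git ls-files -v` shows the flag (an `S` tag marks skip-worktree), and `git show :file2` prints the index copy.

```shell
# Throwaway repository; identity is a placeholder.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com

echo "v1" > file2
git add file2
git commit -q -m "add file2"

git update-index --skip-worktree file2
flag=$(git ls-files -v file2)        # tag letter followed by the path; S marks skip-worktree
echo "v2" > file2                    # change the work-tree copy
status=$(git status --porcelain)     # comparison is skipped, so nothing is reported
index_copy=$(git show :file2)        # the index entry still holds the old content
echo "flag: $flag"
echo "index copy: $index_copy"
```

The file is still in the index, and therefore still tracked; the bit only hides the difference between the index copy and the work-tree copy.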

A side note about directories

Git never tracks a directory. Specifically, you cannot add a directory to the index. What Git does instead is that if there's some tracked file in the index that Git wants to create in the work-tree, and the file's name includes a directory that does not currently exist, Git will make the directory on its own, so that it can put the file into it.

That's mostly it, except that Git will also sometimes remove a directory once it has removed the last file from that directory. There seem to be some odd corner cases here (especially in very old versions of Git, pre-1.8) as I have had Git leave empty directories around when there was no obvious reason to do so. The git clean command will, if requested, remove empty directories from the work-tree.
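The directory behaviour can be observed in a throwaway repository. In this sketch the names `emptydir` and `docs/readme.txt` are placeholders; the final command is a dry run only, so nothing is deleted.

```shell
# Throwaway repository; directory and file names are placeholders.
repo=$(mktemp -d) && cd "$repo"
git init -q

mkdir emptydir
git add emptydir                 # accepted, but there is no file to stage
staged=$(git ls-files)           # index entries after adding the empty directory

mkdir -p docs && touch docs/readme.txt
git add docs                     # stages the file inside, recorded by its full path
staged_after=$(git ls-files)
echo "index entries: $staged_after"

git clean -n -d                  # dry run: lists untracked paths a forced clean would remove
```

The index only ever holds file paths such as `docs/readme.txt`; the directory itself has no entry of its own.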
