"Dangling" commits are normal even in a healthy repository: they are the tip of a set of unreachable commits, and generally arise from things like git rebase
or git commit --amend (which deliberately abandon one or more "old" commits in favor of the "new and improved" copy or copies). In a faulty repository, however, some of these dangling commits, and other commits reachable "behind" them, might be ones you can and would want to recover.
The missing commit is a more serious problem. Since only one missing commit is reported, and exactly one reference, the branch data-8989, is invalid, the missing commit is probably the one that was the tip of data-8989. In that case, one (but only one) of the dangling commits may be a commit further back in that chain.
Expressed visually, a normal graph might look something like this:
             o--o--o   <-- feature
            /       \
...--o--o--o------o--*--o   <-- branch1
         \
          o--o-----------o   <-- branch2
              \
               o----o--o   <-- branch3
where each o represents a commit. Each commit "points back" to its immediate parent commit or, in the case of the merge commit (marked *), to two parents, namely the two commits that were merged. The names, feature and branch1 and so on, are how Git finds the tip of each branch. Once Git has a tip commit, it follows that commit's internal backward-pointing arrows to find the parent commit(s), then uses the parents to find the grandparent commits, and so on.
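As a rough illustration of that backward walk (a sketch in Python, not Git's actual implementation), the graph can be modeled as a table from each commit to its parents. The short IDs, the shrunken graph, and the reachable helper are all made up for this example; real commits are named by hash IDs.

```python
# Hypothetical miniature of the graph above: commit -> list of parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "F1": ["C"],        # first commit on feature
    "F2": ["F1"],       # tip of feature
    "D": ["C"],
    "M": ["D", "F2"],   # the merge commit has two parents
    "E": ["M"],         # tip of branch1
}
branches = {"feature": "F2", "branch1": "E"}


def reachable(tips, parents):
    """Return every commit found by following parent links from the tips."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen


# Starting at feature, the walk only ever moves backwards.
print(sorted(reachable([branches["feature"]], parents)))
# prints ['A', 'B', 'C', 'F1', 'F2']
```

Starting from branch1 instead reaches all eight commits, because the merge's second parent leads back into feature.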
Running git fsck checks whether every commit in the repository can be found (reached) this way. It's normal for some not to be. For instance, we might decide that branch3 is terrible and simply erase the label, i.e., delete the branch. The three commits that are reachable only from branch3 are now abandoned. Git reports the tip as "dangling" and doesn't bother mentioning the other two.
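The difference between "unreachable" and "dangling" can be sketched in a few lines of Python. This is only a toy model under assumed names (the short commit IDs, the reduced graph, and both helper functions are hypothetical), but it shows why only the old tip of branch3 is reported: it is the one abandoned commit that nothing else points to.

```python
# Hypothetical miniature graph: commit -> list of parents.
parents = {
    "A": [], "B": ["A"],                     # mainline, tip of branch1
    "P": ["B"], "Q": ["P"], "T2": ["Q"],     # branch2
    "X": ["Q"], "Y": ["X"], "Z": ["Y"],      # branch3 forks from Q
}
branches = {"branch1": "B", "branch2": "T2", "branch3": "Z"}


def reachable(tips, parents):
    """Follow parent links backwards from every tip."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def unreachable_and_dangling(parents, branches):
    live = reachable(branches.values(), parents)
    unreachable = set(parents) - live
    # A dangling commit is unreachable AND not the parent of any commit.
    referenced = {p for ps in parents.values() for p in ps}
    return unreachable, unreachable - referenced


del branches["branch3"]   # erase the label; the commits themselves remain
unreachable, dangling = unreachable_and_dangling(parents, branches)
print(sorted(unreachable), sorted(dangling))
# prints ['X', 'Y', 'Z'] ['Z']
```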
Note that a branch tip commit need not be the end of the chain. It's just treated as if it were the last one whenever you start at that tip. This is the case for feature, for instance. Git does not (and cannot, at least not easily) look forward along a chain, because all of Git's internal arrows connecting commits go backwards only, from child to parent. (This is a major part of why git fsck is relatively slow: it has to do a whole lot of work to, in effect, reverse the internal arrows.) It's easy to start from the tip of branch1, work backwards, find the merge, work back-and-up, and find the tip commit of feature. It's hard to go the other way. Git therefore normally works backwards.
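To see why going forward is the expensive direction, consider what "reversing the arrows" takes in a toy model. The Python sketch below is an illustration under assumed names (the commit IDs and the children_of helper are hypothetical, and this is not how Git itself is written): finding the children of even one commit requires a pass over every commit in the table.

```python
# Hypothetical miniature graph: commit -> list of parents.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "F1": ["C"], "F2": ["F1"],       # feature
    "D": ["C"], "M": ["D", "F2"],    # M is the merge
    "E": ["M"],                      # tip of branch1
}


def children_of(parents):
    """Build the forward (parent -> children) map by scanning all commits."""
    children = {commit: [] for commit in parents}
    for child, ps in parents.items():
        for parent in ps:
            children[parent].append(child)
    return children


children = children_of(parents)
print(sorted(children["C"]))   # prints ['D', 'F1']
print(children["F2"])          # prints ['M']
print(children["E"])           # prints []
```

The tip of feature (F2) has a child, the merge, which is the sense in which a branch tip need not be the end of the chain.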
Now, when a repository gets damaged, we may lose the labels themselves, or we might lose some commits or other internal objects (there are four object types in total, but we'll concentrate on just the commits here). The items most likely to be damaged are the most recently created (for objects) or updated (for labels). This is because, once created, no object is ever changed [1], and computer crashes tend to lose or corrupt recently-touched files rather than older, less-active files.
Consider what happens if we lose, not the name branch2, but its tip commit:
             o--o--o   <-- feature
            /       \
...--o--o--o------o--*--o   <-- branch1
         \
          o--o           ?   <-- branch2
              \
               o----o--o   <-- branch3
In this case we'll get a complaint that there is a missing commit, because the name branch2 says "find commit 1234567" or whatever, and it's not there. It's the one we lost.
We won't get any dangling commits, though, because the parent of branch2's lost tip commit was also on branch3. Now that commit is only on branch3, rather than being on both branches, because the backwards arrow that led to it from branch2 was part of the commit that we lost.
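The same toy model can mimic this case. In the Python sketch below (hypothetical commit IDs and a made-up check helper, not Git's real algorithm), deleting the tip of branch2 from the object table yields one missing commit and no dangling ones, since everything behind the lost tip is still reachable from branch3.

```python
# Hypothetical miniature graph: commit -> list of parents.
parents = {
    "A": [], "B": ["A"],                     # mainline, tip of branch1
    "P": ["B"], "Q": ["P"], "T2": ["Q"],     # branch2, tip T2
    "X": ["Q"], "Y": ["X"], "Z": ["Y"],      # branch3 forks from Q
}
branches = {"branch1": "B", "branch2": "T2", "branch3": "Z"}


def check(parents, branches):
    """Return (missing, dangling) for a possibly damaged object table."""
    missing, seen = set(), set()
    stack = list(branches.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        if commit not in parents:      # a name or arrow points at nothing
            missing.add(commit)
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    referenced = {p for ps in parents.values() for p in ps}
    dangling = set(parents) - seen - referenced
    return missing, dangling


# Simulate the damage: the object for branch2's tip is gone.
damaged = {c: ps for c, ps in parents.items() if c != "T2"}
missing, dangling = check(damaged, branches)
print(missing, dangling)
# prints {'T2'} set()
```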
If we lose the tip of branch1, we do get a dangling commit:
             o--o--o   <-- feature
            /       \
...--o--o--o------o--*  ?   <-- branch1
         \
          o--o-----------o   <-- branch2
              \
               o----o--o   <-- branch3
The merge commit * no longer has any way to be found, so it's "dangling". One of its parents can be found under the name feature, and the other can be found if we restore the merge commit itself. If we don't, and we let Git garbage-collect the unreachable commits, the graph strips down to this:
             o--o--o   <-- feature
            /
...--o--o--o            ?   <-- branch1
         \
          o--o-----------o   <-- branch2
              \
               o----o--o   <-- branch3
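A small Python simulation of the branch1 case, under the same assumptions as before (made-up commit IDs, a reduced graph without branch2 and branch3, and hypothetical helper names), shows both steps: first the merge turns up as dangling, then a garbage-collection-style sweep removes it along with the commit that only it could reach.

```python
# Hypothetical miniature graph: commit -> list of parents.
parents = {
    "A": [], "B": ["A"], "C": ["B"],
    "F1": ["C"], "F2": ["F1"],       # feature, tip F2
    "D": ["C"], "M": ["D", "F2"],    # M is the merge commit
    "E": ["M"],                      # tip of branch1
}
branches = {"feature": "F2", "branch1": "E"}


def walk(parents, branches):
    """Return (reachable, missing) starting from every branch name."""
    seen, missing = set(), set()
    stack = list(branches.values())
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        if commit not in parents:
            missing.add(commit)
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return seen, missing


# Simulate the damage: the object for branch1's tip is gone.
damaged = {c: ps for c, ps in parents.items() if c != "E"}
live, missing = walk(damaged, branches)
referenced = {p for ps in damaged.values() for p in ps}
dangling = set(damaged) - live - referenced
print(sorted(missing), sorted(dangling))   # prints ['E'] ['M']

# A gc-style sweep keeps only what is still reachable.
after_gc = {c: ps for c, ps in damaged.items() if c in live}
print(sorted(after_gc))   # prints ['A', 'B', 'C', 'F1', 'F2']
```

In the sketch, D is unreachable but not dangling, because the dangling merge M still points to it; that is the commit "behind" the dangling one that is worth saving before any cleanup runs.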
Thus, given a damaged repository, if you don't have backups or other clones from which to recover "missing" items, it's wise to stop modifying it (don't add anything to it) and to run git fsck --lost-found to make Git save any "dangling" commits into .git/lost-found/commit/. You can then look at these (using git log by hash ID) to see if they are valuable. Git will also save dangling blobs (file contents) into .git/lost-found/other/; you can look at those files directly with any file viewer and recover some lost files that way.
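If there are many entries to sift through, a short script can group the report by kind. The Python sketch below assumes the usual one-line-per-object report format ("dangling commit <hash>", "missing commit <hash>", and so on); the sample text and its hash IDs are invented placeholders, and parse_fsck is a hypothetical helper name.

```python
import re

# One report line: state, object type, hash ID.
LINE = re.compile(
    r"^(dangling|missing|unreachable) (commit|tree|blob|tag) ([0-9a-f]+)$"
)


def parse_fsck(text):
    """Group hash IDs from fsck-style report lines by (state, type)."""
    found = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:                      # anything else is ignored
            state, kind, oid = match.groups()
            found.setdefault((state, kind), []).append(oid)
    return found


# Invented sample report; the hash IDs are placeholders, not real objects.
sample = """\
dangling commit 1111111111111111111111111111111111111111
dangling blob 2222222222222222222222222222222222222222
missing commit 3333333333333333333333333333333333333333
dangling commit 4444444444444444444444444444444444444444
"""

report = parse_fsck(sample)
for oid in report.get(("dangling", "commit"), []):
    print("inspect with: git log", oid)
```

In real use the text would presumably come from running the command itself, for example via subprocess.run(["git", "fsck", "--lost-found"], capture_output=True, text=True).stdout, with each printed hash ID then examined by hand.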
Your best bet, though, is to have another clone (or proper backups).
[1] Objects can, however, get packed or repacked, which changes the way they are stored, so this is not a hard-and-fast rule.