Hat tip to Randy Fay's "Gory Story" about git, which made me remember the case below (which is based on a true story) and the unresolved question it leads to.
The question is: Is there any git command (or shell script) to find merge commits which discard all changes to a file from one of the branches and include only the changes from the other?
Note: "The other" branch might or might not change the file in question. If there is no solution for the general case where both branches change the file, I'd also be happy with a solution which works if "the other" branch didn't change the file content (i.e. the file content is taken from the common ancestor of both branches), as shown in the example below.
For clarification, please read the following scenario.
What happened so far...
Here we have a simple git history. You can clone it from github.
       e---f
      /     \
-a---b---c---d--...-x-y-z <-master <-bob
          \
           g---h <-alice
Both Alice* and Bob* are working on the project. They are not git professionals, but their company gave them a basic introduction to git.
This is how the history came to pass:
- Alice pushes c which changes main.c, feature.c and feature_test.c. Then she continues working on feature.c in g and h (locally).
- Bob commits e and f with conflicting changes in main.c.
- Bob does git pull and gets a merge conflict, so he talks to Alice.
- They agree to go with Bob's change and discard Alice's.
- Bob somewhere heard something about ours doing that, so he enters git merge ours into his favorite search engine and finds an SO page on git merge -s ours.
- Bob executes git merge -s ours master. The merge applies cleanly and main.c looks right. He didn't touch feature.c and doesn't know it, so he leaves it alone. So he's happy and pushes the merge to upstream/master.
  - Note, Bob has just undone Alice's work on feature.c. He has also deleted the test cases, which Alice added in feature_test.c, so he won't even get a failing test to alert him to the problem.
- Development continues from there on, people commit, push and merge happily, until...
- Alice discovers the problem when she tries to merge commits g and h into master, because then her new changes to feature.c will not apply cleanly onto the old feature.c from before commit c.
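For anyone who wants to experiment without cloning, the steps above can be approximated in a scratch repository. This is only a sketch: the file contents, the identity settings and the commit helper are invented for illustration, and each commit is tagged with its letter from the graph so that names like d^1 resolve.

```shell
#!/bin/sh
# Sketch: rebuild a history shaped like the one described above.
# File contents and user identity are made up for illustration.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: commit everything and tag the commit with its letter.
commit() { git add -A && git commit -q -m "$1" && git tag "$1"; }

printf 'line 1\nline 2\n' > main.c
commit a
printf 'feature 1\nfeature 2\n' > feature.c
printf 'test feature 1\ntest feature 2\n' > feature_test.c
commit b
git branch bob                            # Bob starts his work from b

# Alice pushes c, touching all three files.
printf 'alice 1\nline 2\n' > main.c
printf 'feature 1\nmodified feature 2\nfeature 3\n' > feature.c
printf 'test feature 1\nmodified test feature 2\n' > feature_test.c
commit c
git branch alice

# Bob commits e and f with conflicting changes to main.c.
git checkout -q bob
printf 'bob 1\nline 2\n' > main.c
commit e
printf 'bob 1\nline 2\nbob 3\n' > main.c
commit f

# Bob "resolves" the conflict: the merge keeps f's tree and drops c's changes.
git merge -q -s ours -m d master
git tag d
git checkout -q master
git merge -q --ff-only bob

# Alice continues on feature.c locally.
git checkout -q alice
printf 'feature 1\nmodified feature 2\nfeature 3\nfeature 4\n' > feature.c
commit g
printf 'feature 1\nmodified feature 2\nfeature 3\nfeature 4\nfeature 5\n' > feature.c
commit h

git log --graph --oneline --all
```

In the resulting repository, merging alice into master should run into a conflict in feature.c, much like in the story.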
Diagnosis
I can tell you, Alice won't have an easy time finding out where her merge conflicts came from. And when she finds it, she's not going to be pleased about Bob.
It's not easy to diagnose where her merge conflict came from, because Alice can easily see in the history that nobody but her ever touched these files. For example git log -- feature.c will only show commit b by Alice (which created the file), but neither c nor d. In this example, Alice might notice the absence of c in that list, but in a real project, spotting a single missing commit among dozens is not that easy.
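The missing c is, as far as I understand it, a consequence of git's default history simplification: when a merge leaves a file identical to one parent, git log follows only that parent. A minimal sketch of how one might list the commits hidden this way, by comparing the default output with --full-history (the function name hidden_commits is made up for this example):

```shell
#!/bin/sh
# hidden_commits REV FILE
# Print commits that git log lists for FILE only when --full-history is given,
# i.e. commits that the default history simplification leaves out.
hidden_commits() {
    rev=$1
    file=$2
    default=$(git log --format=%H "$rev" -- "$file")
    git log --format=%H --full-history "$rev" -- "$file" |
    while read -r sha; do
        case "$default" in
            *"$sha"*) ;;                               # shown either way
            *) git log -1 --format='%h %s' "$sha" ;;   # hidden by default
        esac
    done
}
```

Called as hidden_commits master feature.c in the example repository, this should at least list commit c. It only helps once you already suspect a particular file, though.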
Her next step might be git bisect which will tell her that commit d introduced the problem. Thus, she inspects the commit:
$ git show --patch d
commit a69d49a390b9e92a0fcd60f0396d08a4b839a8c1
Merge: fb1a69b 6ad85a1
Author: Bob <bob@x.com>
Date: Tue May 17 15:20:37 2016 +0200
Merge branch 'master' into bob
$
What? How does commit d break feature.c if it doesn't even contain any changes and much less touches the file feature.c? However, Alice remembers that she can show the difference between two commits, so she decides to look at the changes that d introduced relative to its two parents d^1 (aka f) and d^2 (aka c).
$ git diff d^1 d
$ git diff d^2 d
diff --git a/feature.c b/feature.c
index 2a8efa8..621e63a 100644
--- a/feature.c
+++ b/feature.c
@@ -1,4 +1,2 @@
feature 1
-modified feature 2
-feature 3
-feature 4
+feature 2
diff --git a/feature_test.c b/feature_test.c
index fbc719a..3d14bd1 100644
--- a/feature_test.c
+++ b/feature_test.c
@@ -1,4 +1,2 @@
test feature 1
-modified test feature 2
-test feature 3
-test feature 4
+test feature 2
diff --git a/main.c b/main.c
index e2f8683..2cd1198 100644
--- a/main.c
+++ b/main.c
@@ -1,6 +1,8 @@
line 1
line 2
-alice 3
-alice 4
+bob 3
+bob 4
line 5
line 6
+bob 7
+non 8
$
Aha! So this way you can see what a merge commit actually did. It looks like d is actually exactly the same as f, and all changes in all files from c are undone.
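Alice's manual check can be automated for the whole-tree case. The following is a rough sketch, not a complete answer: it walks all two-parent merges and reports those whose tree is identical to one parent although the other parent differs from the merge base. The function name find_discarding_merges is hypothetical, octopus merges are skipped, and a hit is not necessarily a mistake (the dropped changes may, for instance, already have been cherry-picked to the kept side).

```shell
#!/bin/sh
# find_discarding_merges [REV]
# List two-parent merges reachable from REV (default HEAD) whose tree equals
# one parent's tree while the other parent had changes since the merge base.
find_discarding_merges() {
    git rev-list --min-parents=2 --max-parents=2 "${1:-HEAD}" |
    while read -r m; do
        # Skip merges of unrelated histories (no merge base).
        base=$(git merge-base "$m^1" "$m^2") || continue
        for keep in 1 2; do
            drop=$((3 - keep))
            if [ "$(git rev-parse "$m^{tree}")" = "$(git rev-parse "$m^$keep^{tree}")" ] &&
               ! git diff --quiet "$base" "$m^$drop"; then
                echo "$m identical to parent $keep, changes from parent $drop dropped"
            fi
        done
    done
}
```

A merge of a branch that could have been fast-forwarded is not reported, because in that case the parent whose tree differs from the merge has no changes of its own relative to the merge base.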
Now, one might argue that this is all Bob's fault for using git commands without completely understanding them. However, neither is he the first to do that, nor does blaming him really help anyone.
What matters is: Are there other such cases in the project's history where an inconspicuous merge commit silently undoes the work from a whole branch? This would be especially "great" for a branch which fixed a subtle bug that occurs only once every few months and took ages to find and fix.
In the case above, the file feature.c was changed in only one of the branches. So after the merge, feature.c is reset to the version from the common ancestor (git merge-base f c). The above might be even harder to detect if Bob had also done a non-conflicting change to feature.c, so that it is different from the common ancestor. This is what I referred to as the "general case" in the introduction.
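To make the per-file condition concrete, here is a heuristic sketch for a single merge commit: for every file that one parent changed since the merge base, it checks whether the merged content equals the other parent's version and differs from the changed one. The function name files_dropped_by_merge is made up, file names with unusual characters are not handled, and legitimate conflict resolutions that take one side (like main.c above) are reported as well, so the output is only a list of candidates to review.

```shell
#!/bin/sh
# files_dropped_by_merge MERGE
# For a two-parent merge, list files whose merged content equals one parent's
# version although the other parent had changed them since the merge base.
files_dropped_by_merge() {
    m=$1
    base=$(git merge-base "$m^1" "$m^2") || return 0
    for keep in 1 2; do
        drop=$((3 - keep))
        # Files the "drop" parent changed relative to the merge base ...
        git diff --name-only "$base" "$m^$drop" |
        while IFS= read -r f; do
            # ... that match the "keep" parent and differ from the "drop" parent.
            if git diff --quiet "$m^$keep" "$m" -- "$f" &&
               ! git diff --quiet "$m^$drop" "$m" -- "$f"; then
                echo "$f: kept from parent $keep, changes from parent $drop dropped"
            fi
        done
    done
}
```

Because the comparison is against the kept parent rather than the merge base, this should also cover the case where Bob changed feature.c himself. It cannot tell a deliberate choice from an accident.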
* Names changed.