223

How does the patience algorithm differ from the default git diff algorithm, and when would I want to use it?

Gabe Moothart
  • 1
    Maybe it matches moved code and modified lines which can be much slower – codymanix Oct 28 '10 at 16:32
  • I've extracted a standalone script for Patience Diff from Bazaar, you can find it in [another SO thread](http://stackoverflow.com/questions/4599456/textually-diffing-json/4599500#4599500). – TryPyPy Jan 05 '11 at 00:48
  • 40
    A followup question. When should I not use patience diff? – balki Nov 29 '12 at 15:32
  • 5
    There is also the `--histogram` option, which, according to the docs, extends the patience algorithm to "support low-occurrence common elements": http://git-scm.com/docs/git-diff.html – Robert May 23 '14 at 12:33

3 Answers

189

Bram Cohen, the author of the patience diff algorithm, has written a post about it, but I found that this blog post summarizes the algorithm very well:

Patience Diff, instead, focuses its energy on the low-frequency high-content lines which serve as markers or signatures of important content in the text. It is still an LCS-based diff at its core, but with an important difference, as it only considers the longest common subsequence of the signature lines:

Find all lines which occur exactly once on both sides, then do longest common subsequence on those lines, matching them up.
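The matching step quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not git's or Bazaar's implementation: the function names (`unique_common_lines`, `longest_increasing_pairs`, `patience_anchors`) are made up for this example, and a full patience diff would additionally recurse into the regions between the matched lines, which is left out here.

```python
from bisect import bisect_left
from collections import Counter


def unique_common_lines(old, new):
    """Return (old_index, new_index) pairs for lines that occur exactly once on each side."""
    old_counts = Counter(old)
    new_counts = Counter(new)
    # Map each line that is unique in `new` to its position there.
    new_index = {line: j for j, line in enumerate(new) if new_counts[line] == 1}
    return [
        (i, new_index[line])
        for i, line in enumerate(old)
        if old_counts[line] == 1 and line in new_index
    ]


def longest_increasing_pairs(pairs):
    """Keep the longest run of pairs whose new_index values increase.

    `pairs` is already ordered by old_index, so the result is the longest
    common subsequence of the unique lines. Uses patience sorting (piles).
    """
    tops = []      # new_index currently on top of each pile
    top_pos = []   # position in `pairs` of each pile's top
    back = [None] * len(pairs)
    for pos, (_, j) in enumerate(pairs):
        pile = bisect_left(tops, j)
        if pile == len(tops):
            tops.append(j)
            top_pos.append(pos)
        else:
            tops[pile] = j
            top_pos[pile] = pos
        # Remember which pair was on top of the previous pile.
        back[pos] = top_pos[pile - 1] if pile > 0 else None
    result = []
    pos = top_pos[-1] if top_pos else None
    while pos is not None:
        result.append(pairs[pos])
        pos = back[pos]
    return result[::-1]


def patience_anchors(old, new):
    """Hypothetical helper: the lines a patience-style diff would match up first."""
    return longest_increasing_pairs(unique_common_lines(old, new))


old = ["int f() {", "}", "int g() {", "}"]
new = ["int f() {", "}", "int h() {", "}", "int g() {", "}"]
print(patience_anchors(old, new))  # [(0, 0), (2, 4)]
```

The repeated `}` lines never become anchors because they are not unique, so only the two function headers are matched up.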

When should you use patience diff? According to Bram, patience diff is good for this situation:

The really bad cases are ones where two versions have diverged dramatically and the developer isn't being careful to keep patch sizes under control. Under those circumstances a diff algorithm can occasionally become 'misaligned' in that it matches long sections of curly brackets together, but it winds up correlating the curly brackets of functions in one version with the curly brackets of the next later function in the other version. This situation is very ugly, and can result in a totally unusable conflict file in the situation where you need such things to be presented coherently the most.

Mark Rushakoff
  • 3
    In my experience with XML so far, it gives exactly the same "bad" results as a normal diff. – stivlo Jun 23 '11 at 14:25
  • 5
    I've had much better luck with patience diff with XML; certainly the diff I'm looking at currently has exactly the misalignment problem described with the regular diff algorithm, but looks absolutely grand with patience diff. – me_and Sep 14 '12 at 13:10
  • 22
    This blog has a great explanation, including an animated GIF of the process: http://alfedenzo.livejournal.com/170301.html – Quantum7 Jun 18 '13 at 23:38
  • 3
    I found this blog post very interesting; it gives a good explanation with further links to the algorithms' details: http://fabiensanglard.net/git_code_review/diff.php Hope it will be useful to someone – SathOkh Jun 24 '14 at 20:25
  • The frobnitz/fib/fact diff can be seen at https://gist.github.com/roryokane/6f9061d3a60c1ba41237 – George V. Reilly Dec 27 '17 at 05:47
54

You can also use it for merges (worked really well here for some XML conflicts):

git merge --strategy-option=patience ...
robinst
44

The patience diff algorithm is a slower diff algorithm that shows better results in some cases.

Suppose you have the following file checked in to git:

.foo1 {
    margin: 0;
}

.bar {
    margin: 0;
}

Now we reorder the sections and add a new line:

.bar {
    margin: 0;
}

.foo1 {
    margin: 0;
    color: green;
}

The default diff algorithm claims that the section headings have changed:

$ git diff --diff-algorithm=myers
diff --git a/example.css b/example.css
index 7f1bd1e..6a64c6f 100755
--- a/example.css
+++ b/example.css
@@ -1,7 +1,8 @@
-.foo1 {
+.bar {
     margin: 0;
 }

-.bar {
+.foo1 {
     margin: 0;
+    color: green;
 }

Whereas patience diff shows a result that is arguably more intuitive:

$ git diff --diff-algorithm=patience
diff --git a/example.css b/example.css
index 7f1bd1e..6a64c6f 100755
--- a/example.css
+++ b/example.css
@@ -1,7 +1,8 @@
-.foo1 {
-    margin: 0;
-}
-
 .bar {
     margin: 0;
 }
+
+.foo1 {
+    margin: 0;
+    color: green;
+}
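One plausible reading of why patience diff keeps `.bar {` in place can be checked with a small Python sketch. The two lists below are the file contents from this example; everything else is an illustration, not git's actual code. It only counts which lines occur exactly once in both versions, since those are the only lines patience diff considers as anchors.

```python
from collections import Counter

old = [".foo1 {", "    margin: 0;", "}", "", ".bar {", "    margin: 0;", "}"]
new = [".bar {", "    margin: 0;", "}", "", ".foo1 {", "    margin: 0;", "    color: green;", "}"]

old_counts, new_counts = Counter(old), Counter(new)

# Lines that occur exactly once in each version are the only anchor candidates.
candidates = [line for line in old if old_counts[line] == 1 and new_counts[line] == 1]
print(candidates)  # ['.foo1 {', '', '.bar {']

# Position of each candidate in the old and the new file.
positions = {line: (old.index(line), new.index(line)) for line in candidates}
print(positions)  # {'.foo1 {': (0, 4), '': (3, 3), '.bar {': (4, 0)}
```

The repeated `margin: 0;` and `}` lines are ignored, and the three candidates appear in opposite order in the two versions, so at most one of them can be kept as an unchanged anchor. The rest of each section is then shown as removed or added as a whole.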

There's a good discussion of subjective diff quality here, and git 2.11 is exploring diff heuristics further.

Note that the patience diff algorithm still has some known pathological cases.

Wilfred Hughes