42

How does git determine that a particular merge has a conflict and what the conflict is?

My guess would go something like this: if the two commits being merged have a common parent commit, and if they have both changed line X from what the parent had, that's a conflict.

What complicates my understanding is:

  • "Changing line X" can mean replacing it with several new lines, and that's still shown as one conflict (version A has this one line, and version B has these 5 lines, or whatever)
  • If you did insert lines in one of the commits, a dumber algorithm would think that all subsequent lines had changed: Line 30 now has the former contents of line 25, 31 has the former contents of 26, etc. But git can tell that those are the same, and I don't know how.

Can anybody explain how this works, or point me to a link that does?

Nathan Long

3 Answers

23

Basically, with git, every merge is a conflict, which leaves you with an index that contains three versions of each file: the version from each branch and the base. Various resolvers are then run on this index, and they decide for each individual file how to resolve the matter.

The first stage is a trivial resolver, which takes care of things like unchanged files, cases where one branch has modified a file while the other didn't, or where both branches contain the same new version of the file.
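The trivial cases can be sketched in a few lines of Python. This is only an illustration of the decision table, assuming each file version is identified by a content hash (or `None` if the file is absent on that side); the names `trivial_resolve` and `NEEDS_CONTENT_MERGE` are made up for this sketch and are not git's own.

```python
# Sentinel meaning "both sides changed the file differently";
# a content-level merge has to look at this file.
NEEDS_CONTENT_MERGE = object()

def trivial_resolve(base, ours, theirs):
    """Pick a version of one file without looking at its contents.

    Each argument is a content hash, or None if the file does not
    exist on that side. Returns the hash to keep (None means the
    file ends up deleted) or NEEDS_CONTENT_MERGE.
    """
    if ours == theirs:
        return ours      # unchanged on both sides, or same change on both
    if ours == base:
        return theirs    # only their branch touched the file
    if theirs == base:
        return ours      # only our branch touched the file
    return NEEDS_CONTENT_MERGE

# Hypothetical hashes for illustration
print(trivial_resolve("h1", "h1", "h2"))                          # theirs wins
print(trivial_resolve("h1", "h2", "h3") is NEEDS_CONTENT_MERGE)   # both changed
```

Only files that fall through to the last case are handed to the later, content-aware stage.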

Afterwards, it's plugins that look at the remaining cases. There is a plugin that handles text files by identifying individual changes (like diff) in one branch and trying to apply those to the other branch, falling back on placing conflict markers if that doesn't work. You can easily hook in your own merge tool at this point. For example, you could write a tool that knows how to merge XML files without violating well-formedness, or one that provides a graphical user interface with interactive editing and a side-by-side view (kdiff3 does that, for example).
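To make the text-file case concrete, here is a rough Python sketch of a line-based three-way merge. It is not git's implementation (git uses its own diff code); it assumes that lines are aligned with `difflib.SequenceMatcher`, and the function names and marker labels are made up for the example. Base lines that survive unchanged on both sides act as anchors, which is why an insertion does not make every following line look changed. The region between two anchors is taken from whichever side changed it, and becomes a conflict only if both sides changed it differently.

```python
from difflib import SequenceMatcher

def _match_map(base, other):
    """Map index of each base line kept in `other` to its index there."""
    mapping = {}
    matcher = SequenceMatcher(None, base, other, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping

def merge3(base, ours, theirs):
    """Merge two line lists against their common base.

    Returns (merged_lines, number_of_conflicts).
    """
    to_ours = _match_map(base, ours)
    to_theirs = _match_map(base, theirs)
    # Anchors: base lines present, unchanged, on both sides.
    anchors = [(i, to_ours[i], to_theirs[i])
               for i in range(len(base))
               if i in to_ours and i in to_theirs]
    anchors.append((len(base), len(ours), len(theirs)))  # end sentinel

    merged, conflicts = [], 0
    b0 = o0 = t0 = 0
    for b1, o1, t1 in anchors:
        b, o, t = base[b0:b1], ours[o0:o1], theirs[t0:t1]
        if o == t or t == b:
            merged += o          # same change on both sides, or only ours changed
        elif o == b:
            merged += t          # only theirs changed this region
        else:
            conflicts += 1       # both changed it, differently
            merged += ["<<<<<<< ours", *o, "=======", *t, ">>>>>>> theirs"]
        if b1 < len(base):
            merged.append(base[b1])  # the anchor line itself
        b0, o0, t0 = b1 + 1, o1 + 1, t1 + 1
    return merged, conflicts

base = ["a", "b", "c", "d"]
clean, n_clean = merge3(base, ["a", "NEW", "b", "c", "d"], ["a", "b", "c", "D2"])
clash, n_clash = merge3(["a", "b", "c"], ["a", "X", "c"], ["a", "Y1", "Y2", "c"])
print("\n".join(clash))
```

In the second call, one side replaces a line with one new line and the other replaces it with two, and the result is a single conflict block. That is the "one line versus several lines" situation from the question.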

So the presentation of conflicts is really a matter of the plugin used; the default plugin for text files will use the same style as CVS did, because people and tools are used to it, and the conflict markers are a known syntax error in almost any programming language.

Simon Richter
  • Good answer! The issue of merging was illuminated for me. – Will Sheppard Jul 12 '12 at 13:07
  • "every merge is a conflict" whaaat? – Mark E. Haase Sep 28 '12 at 15:54
  • 1
    @mehaase, this is how it works. :) As a user, a large part is hidden from you (the trivial resolvers, for example), but this is important to know if you want your own resolvers. – Simon Richter Sep 28 '12 at 16:42
  • "It's plugins that look at the remaining cases" can be slightly misleading, as it is not possible to hook in a custom-made program that would be called by git merge-file. Only an interactive merge-tool can do that (like kdiff3). – DomQ Jul 26 '13 at 10:23
  • 1
    @DomQ, `merge-file` is the last-resort merge driver for text files. You can specify that a different merge driver should be used instead. – Simon Richter Jul 26 '13 at 13:19
  • I assume the "base" from "the versions from each branch and the base" means the nearest common ancestor commit? If so, how is "nearest" defined (number of commits)? Does it make any difference if the file was modified in the "base" commit (probably not)? – Jan Żankowski Jun 17 '16 at 10:36
  • 1
    @JanŻankowski, as far as I know, it is number of commits. Having multiple candidate commits where one isn't an ancestor of the other is rather uncommon, but there is the option of specifying which base to use in the merge precisely for that case. The differences between the base version and its predecessors don't matter. – Simon Richter Jun 17 '16 at 11:25
  • @JanŻankowski (and others): the technical definition of merge base is from applying the Lowest Common Ancestor algorithm to a DAG rather than a tree. This may produce more than one merge base, and in that case it's up to the merge strategy (`-s resolve` vs `-s recursive`) to decide how to handle it. `-s resolve` picks one at apparent-random; `-s recursive` merges the merge bases, hence the name "recursive". (Other merge strategies can do whatever they want but `-s` style strategies are hard to write so you probably don't have custom ones.) Use `git merge-base --all` to find the merge bases. – torek Jun 21 '19 at 15:29
  • For more information, and information about exotic cases (e.g., `-s octopus`), see also [the `git merge-base` documentation](https://git-scm.com/docs/git-merge-base), especially the DISCUSSION section. – torek Jun 21 '19 at 15:34
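The merge-base definition from the comments above can be illustrated with a small Python sketch. It assumes the history is given as a hypothetical `parents` dictionary mapping each commit name to its parent commits. It is a brute-force illustration of "common ancestors that are not ancestors of another common ancestor", not how `git merge-base` is implemented.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def merge_bases(parents, a, b):
    """Common ancestors of a and b that no other common ancestor descends from."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}

# Ordinary fork: C and D both branch off B.
simple = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}
# Criss-cross: M1 and M2 each merged X and Y, so there are two bases.
crisscross = {"R": [], "X": ["R"], "Y": ["R"],
              "M1": ["X", "Y"], "M2": ["Y", "X"]}

print(merge_bases(simple, "C", "D"))
print(sorted(merge_bases(crisscross, "M1", "M2")))
```

The criss-cross history is the kind of case where more than one merge base comes back and the merge strategy has to decide what to do with them.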
7

I don't think there is anything special about Git's merge algorithm: it is a classic 3-way merge (not the Codeville one), which can be used with several strategies (recursive by default, or resolve or octopus). The result is a fairly simple merge process, which is described here.
Any visualization need is then delegated to third-party merge/diff tools.

VonC
3

Browse to the HOW CONFLICTS ARE PRESENTED paragraph on this page.

Later edit: There is no real documentation for the conflict cases or for the file conflict markers, and since I am getting bashed in the comments here, here are pointers into the source code that come close to showing which strategies git follows to reach a conflict state. In the file merge-recursive.c, search for the string "CONFLICT". Doing that, we can easily see that there are really only a handful of conflict cases, such as:

  • CONFLICT (rename/rename)
  • CONFLICT (content)
  • CONFLICT (rename/directory)
  • CONFLICT (rename/delete)
  • CONFLICT (rename/add)
  • CONFLICT (delete/modify)
  • ... and so on

If you ask me, yes, they should be documented and clearly pointed out, but they aren't, so there is nothing left to do but inspect the source. Someone could pick up from here, write proper documentation, and send it to the git project.

@Wim Coenen yes, it depends on the merge strategy too, but how conflicts are presented gives much more insight. You can then read about merge strategies as well, but if you ask me, you will still be left in doubt.

Shinnok
  • You may as well have said 'RTFM'. I personally find `git`'s documentation to be a little obscure in this regard. – Blair Holloway Feb 07 '11 at 11:51
  • I'm not downvoting this because it's an interesting link, but it isn't exactly what I'm asking for. That paragraph shows what it looks like **after** git has decided there's a conflict. I'm trying to learn how git decides that there's a conflict in the first place. – Nathan Long Feb 07 '11 at 11:53
  • The relevant paragraph is actually MERGE STRATEGIES – Wim Coenen Feb 07 '11 at 12:13
  • 1
    @Wim Coenen - That paragraph implies that the answer to my question depends on the merge strategy that you choose. Which is interesting, but it's like "hey, you thought you didn't understand ONE thing, here's TEN things you don't understand instead!" :) Maybe we can work through the default strategy for the purposes of this question? – Nathan Long Feb 07 '11 at 12:24