
[Three images omitted]

The above is the result of merge and rebase.

My question: in the final state, are C5 and C3' identical?

Or, put another way, is git rebase equal to git merge + remove C3?

asker

3 Answers


The example isn't very good, because it only considers one commit (merged or rebased), giving you the impression that the resulting commits are similar. In general, a rebase will add multiple commits (one per commit being replayed), while a merge will add at most one (fast-forward merges add none).

Moreover, as long as there is no conflict to solve, or if you solve said conflicts the same way each time, the final content of C3' and C5 will be the same, but they remain different commits: since C3' and C5 have different parents, they also have different hashes (a fact that is more obvious in the illustrations below). Correspondingly, the recorded history for each is different. Note that for the rebase the history is linear, while for the merge it's a lattice.
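The "same content, different commits" point can be checked in a throwaway repository. The following is only a minimal sketch: the file names, the `demo` identity and the helper branches `merged` and `rebased` are assumptions made up for the illustration, and there are no conflicts by construction.

```shell
#!/bin/sh
# Sketch only: hypothetical throwaway repo, illustrative names throughout.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is 'master'

echo base > base.txt
git add base.txt
git commit -q -m "C2"

git checkout -q -b experiment
echo exp > exp.txt
git add exp.txt
git commit -q -m "C3"

git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "C4"

# Variant 1: merge experiment into a copy of master (plays the role of C5).
git checkout -q -b merged master
git merge -q --no-edit experiment

# Variant 2: rebase a copy of experiment onto master (plays the role of C3').
git checkout -q -b rebased experiment
git rebase -q master

merged_tree=$(git rev-parse 'merged^{tree}')
rebased_tree=$(git rev-parse 'rebased^{tree}')
merged_commit=$(git rev-parse merged)
rebased_commit=$(git rev-parse rebased)

echo "trees:   $merged_tree / $rebased_tree"
echo "commits: $merged_commit / $rebased_commit"
```

The two tree IDs come out equal while the two commit IDs differ, because a commit ID also covers the parent list and the commit message.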

Consider the same question when merging/rebasing several commits, as illustrated in "A Visual Git Reference" from Mark Lodato. You will see that the end result is quite different.

git checkout master
git merge other # update master with tip of branch 'other' changes

git merge other

Git takes only:

  • the current commit (ed489 below, since you are on master),
  • the latest commit of branch other (which is a snapshot representing the full content of the repo as of that commit, not a delta), and
  • their common ancestor (b325c),

and performs a three-way merge.

For the meaning of the working directory and stage in this diagram, note the arrows going to the three-way merge, then to the working directory and stage. The working directory represents all the files that you see (on your hard drive), some of which are changed as a result of the three-way merge. The stage holds the files changed by the three-way merge, which is then used to create the new commit (f8bc5).
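The three inputs of the merge can be inspected with plumbing commands. This is a rough sketch in a hypothetical scratch repository (file names and the `demo` identity are assumptions); `git merge-base` reports the common ancestor, and the resulting merge commit records the two tips as its parents.

```shell
#!/bin/sh
# Sketch only: hypothetical throwaway repo, illustrative names throughout.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "common ancestor"
base=$(git rev-parse HEAD)

git checkout -q -b other
echo other > other.txt
git add other.txt
git commit -q -m "work on other"

git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "work on master"

ours=$(git rev-parse master)              # current commit
theirs=$(git rev-parse other)             # tip of the branch being merged
ancestor=$(git merge-base master other)   # third input of the three-way merge

git merge -q --no-edit other

first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
echo "ancestor: $ancestor"
echo "parents:  $first_parent $second_parent"
```

Intermediate commits on either branch are not consulted individually; only these three snapshots feed the merge.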

This is very different from a rebase which strives to reapply each and every commit of a branch on top of the destination branch:

git checkout topic # this time we are on topic
git rebase master  # means: recreate every topic commit on top of master;
                   # at the end, we are still on the (new) 'topic' branch

git rebase master

The above command takes all the commits that exist in 'topic' but not in master (namely 169a6 and 2c33a), replays them onto master, and then moves the branch head to the new tip. Note that the old commits will be [eventually] garbage collected if they are no longer referenced.

Rebasing uses the working directory and the staging area as it replays the commits (apply changes to working directory, add changes to staging area, commit the staged changes, repeat). Once all this is done, the head of the rebased branch is set to the last of the new commits (f7e63).
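A small experiment makes the replay visible. This sketch assumes a scratch repository with a two-commit `topic` branch (all names are illustrative): after the rebase there are two new commits on top of master, no merge commit anywhere, and the old tip still exists as an object until it is garbage collected.

```shell
#!/bin/sh
# Sketch only: hypothetical throwaway repo, illustrative names throughout.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b topic
echo one > one.txt
git add one.txt
git commit -q -m "topic 1"
echo two > two.txt
git add two.txt
git commit -q -m "topic 2"
old_tip=$(git rev-parse topic)

git checkout -q master
echo main > main.txt
git add main.txt
git commit -q -m "master work"

git checkout -q topic
git rebase -q master

new_tip=$(git rev-parse topic)
replayed=$(git rev-list --count master..topic)   # commits recreated on top of master
merges=$(git rev-list --merges --count topic)    # merge commits in topic's history
fork_point=$(git merge-base master topic)        # now the tip of master
old_type=$(git cat-file -t "$old_tip")           # old commit object is still present

echo "replayed=$replayed merges=$merges old tip is still a $old_type"
```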


2 additional differences:

VonC
  • I now have this doubt: is the net result the same when you do `git push`, no matter whether you did `git merge` or `git rebase`? Maybe the commit message will be lost though. – asker Aug 15 '11 at 08:20
  • @asker: `git push` is about updating a *remote* repo with the new merges done locally (default being "all changes on the local branches that have matching remote branches"). If that remote repo hadn't had any new changes pushed by others, then the commits pushed will be *identical* (same SHA1). The fact that those local commits (that you are pushing) are the result of a merge or a rebase doesn't matter. – VonC Aug 15 '11 at 08:26
  • Not identical, as `git merge` will compress multiple commits into a single commit, so the commit messages will be lost. – asker Aug 15 '11 at 08:32
  • I don't understand the links of **Working Directory** and **Stage** in the first chart .. – asker Aug 15 '11 at 08:55
  • @asker: when you push, you are pushing *exiting* commits (whether they are the result of a merge or a rebase). Those commits, if there are no conflicts, will be there on the remote repo once pushed. – VonC Aug 15 '11 at 09:02
  • @asker: the working directory represents the files that you *see* (on your hard drive). The stage represents the content staged for the next commit. A merge or a rebase will update both. But their content can be different (since a merge or a rebase won't necessarily involve *all* your files, only some of them). – VonC Aug 15 '11 at 09:04
  • I mean, the code will be there on the remote repo, but the message you specified with `-m "messages ..."` will be lost. In this regard, `rebase` is better. – asker Aug 15 '11 at 09:11
  • @asker: rebase is better if you are not rewriting history of commits that were already pushed. – VonC Aug 15 '11 at 09:13
  • @VonC: is the final set of code the same after `git merge` or `git rebase`? Is the only difference that the commit object is different? – Kit Ho Aug 15 '11 at 09:24
  • Will **Working Directory** and **Stage** change if you switch to another branch? – asker Aug 15 '11 at 09:28
  • @asker: yes, they will change. – VonC Aug 15 '11 at 09:33
  • @Kit Ho: the final *content* will be the same (if there is no conflict to solve, or if you solve said conflicts with the same way each time). But since their parent will differ, their SHA1 won't be the same. – VonC Aug 15 '11 at 09:34
  • The `SHA1` is calculated locally when you do `git commit`; this can cause a conflict when you do `git push`, right? – asker Aug 15 '11 at 09:50
  • @asker: yes, if other commits have been pushed to the same repo from another downstream repo. (By the way, for the meaning of upstream vs.downstream, see http://stackoverflow.com/questions/2739376/definition-of-downstream-and-upstream/2749166#2749166) – VonC Aug 15 '11 at 09:52
  • What does git do when that conflict happens? – asker Aug 15 '11 at 09:59
  • @asker: it forces you to replay similar commits. That is bad. See "[RECOVERING FROM UPSTREAM REBASE](http://kernel.org/pub/software/scm/git/docs/git-rebase.html#_recovering_from_upstream_rebase)". – VonC Aug 15 '11 at 10:04
  • Does each `commit` contains the `diff` of **changed files**, or the `snapshot` of **changed files**? – asker Aug 16 '11 at 03:52
  • @asker: snapshot. Git is a *content* management system at its core, not a delta-based file system like most other VCSs. You will find deltas only when pushing/pulling, in order to minimize the amount of data transferred. See http://stackoverflow.com/questions/995636/popularity-of-git-mercurial-bazaar-vs-which-to-recommend/995799#995799 and its associated answer: http://stackoverflow.com/questions/612580/how-does-git-solve-the-merging-problem/612747#612747. – VonC Aug 16 '11 at 04:05
  • @asker: no, every Git repo (remote or local) contains snapshots, even though they can be stored in pack files (based on deltas) internally, but that is an *implementation* detail. From the user's point of view, you are still dealing with snapshots. See "Are Git's pack files deltas rather than snapshots?" http://stackoverflow.com/questions/5176225/are-gits-pack-files-deltas-rather-than-snapshots, and http://www.kernel.org/pub/software/scm/git/docs/technical/pack-format.txt, and http://www.kernel.org/pub/software/scm/git/docs/technical/pack-heuristics.txt for all the details. – VonC Aug 16 '11 at 04:14
  • You mentioned that `when you push, you are pushing exiting commits `,but it seems this is `false`. I just tried to commit twice, and then push.If it's only pushing exiting commits,there should be only 1 commit, but the fact is that both commits are pushed. – asker Aug 16 '11 at 04:25
  • @asker "*exiting* commits"? I meant "existing commit" (that hadn't been pushed yet). In your case, your two new existing commits are pushed as expected. – VonC Aug 16 '11 at 05:52
  • I just went over the article on `RECOVERING FROM UPSTREAM REBASE`, but it's different from the case I meant. My case is simpler than that: A and B are both pushing commits to the same remote repo, and accidentally they have two commits with the same SHA-1 (this is extremely rare, but it can happen because the SHA-1 is calculated independently on A and B). What does Git do in that case? – asker Aug 16 '11 at 06:11
  • @asker: not sure. SHA1 collisions are *extremely* rare (http://lwn.net/Articles/307281/). Not much, according to Linus (http://kerneltrap.org/mailarchive/git/2006/8/28/211065). Your case might be simpler, but also very improbable. – VonC Aug 16 '11 at 06:15
  • So git doesn't deal with this case at all? It's very improbable but still may happen.. – asker Aug 16 '11 at 06:19
  • @asker: considering that "this case" is likely to *never* happen, it won't do much except ignore one of the commits (quote): "If it has the same SHA1, it means that when we receive the object from the other end, we will _not_ overwrite the object we already have". Don't forget that pushing identical files is not enough to push an identical SHA1. The SHA1 of a *commit* (which references a tree, which references the files, or actually the blobs, you have made) is also based on the SHA1 of the *parents* of said commit. So the chances of pushing a *commit* with an identical SHA1 are slim. – VonC Aug 16 '11 at 06:26
  • I think if that happens, the acceptable solution is to do(manually of course:)) `Cherry Pick` on the commit with the conflict SHA-1 but not yet pushed, and **replace the said commit with the new one**. But how to do the step in bold? – asker Aug 16 '11 at 06:34
  • @asker: considering that it is likely to never happen, I wouldn't spend too much time on that ;) But should you do it, I would rebase the local commits (`rebase --interactive`) in order to remove my local commit. That way, a simple git pull will import the remote commit, without any more conflicts since the local one has been purged. – VonC Aug 16 '11 at 07:24
  • Do you have any recommendations on books about git internals/implementations?I just downloaded a book call but disappointedly it doesn't say much about the internals. It's mostly from user aspect,while I want to see it from developer aspect. – asker Aug 16 '11 at 07:34
  • @asker: The chapter 9 of Pro Git is a good start: http://progit.org/book/ch9-0.html. Other references are mentioned in http://stackoverflow.com/questions/866185/where-can-i-find-a-tutorial-on-gits-internals – VonC Aug 16 '11 at 07:38
  • Maybe you're interested to join the follow-up question: http://stackoverflow.com/questions/7074953/how-to-change-the-sha-1-of-a-specific-commit-in-place-with-git – asker Aug 16 '11 at 07:58
  • Sorry to 'bump' this up, but when would one want to rebase rather than merge? It seems like merge has a clearer commit history to follow. – Major Productions Sep 18 '12 at 15:26
  • @kevinmajor1 `git rebase` has its use: see http://stackoverflow.com/questions/457927/git-workflow-and-rebase-vs-merge-questions/457988#457988 and http://stackoverflow.com/questions/804115/git-rebase-vs-git-merge?lq=1 – VonC Sep 18 '12 at 15:38

No. C5 and C3' will have different parent commits, meaning they will themselves be different.

If you are asking whether the root treeish referenced by C5 and C3' will be identical, then yes (assuming that any conflicts were resolved the same way). In other words, the tree of files "contained in" both commits will be the same.

cdhowie
  • `git rebase` is equal to `git merge` + remove **C3**? – asker Aug 15 '11 at 06:26
  • Not precisely, but the resulting treeish should be identical, yes. – cdhowie Aug 15 '11 at 06:31
  • Why do you say not precisely? IMO they're exactly the same: removing C3 will point `experiment` to **C5**. – asker Aug 15 '11 at 06:33
  • In this case, they are roughly identical. But they would not be identical if you were rebasing more than one commit. – cdhowie Aug 15 '11 at 06:36
  • Isn't rebasing more than one commit just rebasing one commit more than once? If they're the same when done once, they're still the same after being done more than once. – asker Aug 15 '11 at 06:39
  • I mean rebasing multiple commits at once -- if the branch you are rebasing has diverged by more than one commit. – cdhowie Aug 15 '11 at 06:45
  • IMO they're still the same in that case,what's different otherwise? – asker Aug 15 '11 at 06:48
  • Because rebasing two commits onto the tip of another branch is not the same as merge-and-remove (actually merge-and-rewrite) since it will introduce intermediate commits. – cdhowie Aug 15 '11 at 06:54
  • Can you elaborate? I don't think there's a need for intermediate commits; rebase is equal to merge + pointing the branch to the resulting commit. – asker Aug 15 '11 at 06:58
  • No, `git rebase` is **not** equivalent to that. You are thinking of squashing (for example `git merge --squash`). `git rebase` will take **every commit** that the two branches do not have in common and apply it to the target branch. – cdhowie Aug 15 '11 at 07:00
  • `take every commit that the two branches do not have in common and apply it to the target branch`,that's exactly what `git merge` does. – asker Aug 15 '11 at 07:05
  • You're not understanding what I'm saying. It is simply never true that rebase and merge produce an identical commit, and neither will merging and removing a parent reference produce an identical result to a rebase in every case. Rebase and merge will produce commits that have the same file tree, but the commits will not be identical. – cdhowie Aug 15 '11 at 07:09
  • Do you have a better graph to illustrate it? – asker Aug 15 '11 at 07:11
  • There are graphs explaining merge and rebase at http://eagain.net/articles/git-for-computer-scientists/ – Esko Luontola Aug 15 '11 at 07:24
  • See this graph: http://i51.tinypic.com/2h57mm9.png -- note that while the merged and rebased commits will have an identical tree of files, the commits themselves will be very different. – cdhowie Aug 15 '11 at 07:36

If you look at just the contents of the commits (i.e. not what are their parents) then both C5 and C3' contain the same thing (assuming there were no merge conflicts or other things requiring manual change). So somebody could think of it as being the same as if C3 had been removed, for some definition of "remove C3". But in Git it's not possible to remove any commits (all commits are immutable), so the operation of removing a commit from the tree is not defined for Git.
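The immutability point can be illustrated with a command that looks like it rewrites a commit. This is a minimal sketch using `git commit --amend` as the example, in a hypothetical scratch repository: the amended commit is a new object with a new ID, and the original object is still in the object store afterwards.

```shell
#!/bin/sh
# Sketch only: hypothetical throwaway repo, illustrative names throughout.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q

echo one > f.txt
git add f.txt
git commit -q -m "first"
old=$(git rev-parse HEAD)

# "Rewrite" the commit: Git actually creates a replacement commit.
git commit -q --amend -m "first (amended)"
new=$(git rev-parse HEAD)

old_type=$(git cat-file -t "$old")   # the original object can still be read
echo "old=$old new=$new old object type: $old_type"
```

The branch simply points at the new commit; the old one becomes unreferenced rather than modified.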

Esko Luontola
  • Wrong, AFAIK `git commit -a --amend` will also destroy a state; search for "destroy" in this article: http://cworth.org/hgbook-git/tour/ – asker Aug 15 '11 at 06:43
  • No. It creates a new commit. The old commit is not lost and you can get it back with `git reflog` (or `git checkout` if you remember the commit hash). The only command which removes commits is `git gc`. – Esko Luontola Aug 15 '11 at 06:48
  • A good article for understanding how Git works is http://eagain.net/articles/git-for-computer-scientists/ – Esko Luontola Aug 15 '11 at 06:50
  • It won't remove commits which are reachable from a branch, and it cannot change the parents of any commit. – Esko Luontola Aug 15 '11 at 06:51
  • OK, let's define "remove C3" as `points the branch that currently points to C3 to the final commit`. Under this definition, is `git rebase` equal to `git merge` + remove C3? – asker Aug 15 '11 at 06:53
  • @asker: In this case it definitely wouldn't be `git commit` **-a** `--amend`. You are trying to keep the tree, so adding all changes in working directory is not the thing you want. But --amend won't let you change parent list anyway; you'd have to do `git reset --soft HEAD^; git commit -c HEAD@{1}` to drop the other parent. – Jan Hudec Aug 15 '11 at 07:02
  • "points the branch that currently points to C3 to the final commit" would be `git branch -f experiment master` or `git reset --hard master` which doesn't add/remove/modify any commits. What `git rebase master` equals is something like `git checkout C5; git cherry-pick C3; git branch -f experiment` – Esko Luontola Aug 15 '11 at 07:19
  • Of course "remove C3" doesn't create any commits; the commit is created by `git merge`. That's why I say `git rebase` equals the combination of the two. – asker Aug 15 '11 at 07:23