38

When I was taught using Git with central repo (project on Gitorious) I was told to always use rebase instead of merge because we want to have linear history. So I have always been trying to work that way.

Now when I come to think about it is it really so beneficial? Rebasing branch with many commits is much more time consuming then simple merge.

There are 2 advantages that come to my mind right now:

  1. git bisect
  2. Possibility of submitting with history to another version control system like SVN.

Are there any other benefits?

Lii
  • 9,906
  • 6
  • 53
  • 73
Jarosław Jaryszew
  • 902
  • 1
  • 8
  • 13
  • 3
    See: http://stackoverflow.com/q/457927/1256452 http://blogs.atlassian.com/2013/10/git-team-workflows-merge-or-rebase/ http://blog.sourcetreeapp.com/2012/08/21/merge-or-rebase/ http://www.derekgourlay.com/archives/428 – torek Dec 03 '13 at 11:11
  • 1
    Those are the only two reasons I know of. I prefer commit bubbles, so I can see separated-out, targeted build efforts on their original branches. – Gary Fixler Dec 03 '13 at 11:48

4 Answers4

37

As far as I am concerned, keeping a linear history is mostly for asthetics. It's a very good way for a very skilled, disciplined developer to tidy up the history and make it nice to skim through. In a perfect world, we should have a perfectly linear history that makes everything crystal clear.

However, on a large team in an enterprise environment I often do not recommend artificially keeping a linear history. Keeping history linear on a team of sharp, experienced, and disciplined developers is a nice idea, but I often see this prescribed as a 'best practice' or some kind of 'must do'. I disagree with such notions, as the world is not perfect, and there are a lot of costs to keeping a linear history that people do not do a good job of disclosing. Here's the bullet list overview:

  • Rewriting history can include erasing history
  • Not everybody can even rebase
  • The benefits are often overstated

Now, let's dig into that. Warning: Long, anecdotal, mostly ranting

Rewriting history can include erasing history Here's the problem with rebasing all of your commits to keep everything nice and linear: Rebasing is not generally loss-less. There is information- real, actual things that were done by the developer - that may be compressed out of your history when you rebase. Sometimes, that's a good thing. If a developer catches their own mistake, it's nice for them to do an interactive rebase to tidy that up. Caught-and-fixed mistakes have already been handled: we don't need a history of them. But some people work with that one individual who always seems to screw up merge conflicts. I don't personally know any developers named Neal, so let's say it's a guy named Neal. Neal is working on some really tricky accounts receivable code on a feature branch. The code Neal wrote on his branch is 100% correct and works exactly the way we want it to. Neal gets his code all ready to get merged into master, only to find there are merge conflicts now. If Neal merges master into his feature branch, we have a history of what his code originally was, plus what his code looked like after resolving the merge conflicts. If Neal does a rebase, we only have his code after the rebase. If Neal makes a mistake when resolving merge conflicts, it will be a lot easier to troubleshoot the the former than it will be the latter. But worse, if Neal screws up his rebase in a sufficiently unfortunate way (maybe he did a git checkout --ours, but he forgot he had important changes in that file), we could altogether lose portions of his code forever.

I get it, I get it. His unit tests should have caught the mistake. The code reviewer should have caught the mistake. QA should have caught the mistake. He shouldn't have messed up resolving the merge conflicts in the first place. Blah, blah, don't care. Neal is retired, the CFO is pissed because our ledger is all screwed up, and telling the CFO 'according to our development philosophy this shouldn't have happened' is going to get me punched in the face.

Not everybody can even rebase, bro. Yes, I've heard: You work at some space age startup, and your IoT coffee table uses only the coolest and most modern, reactive, block-chain based recurrent neural network, and the tech stack is sick! Your lead developer was literally there when Go was invented, and everybody who works there has been a Linux kernel contributor since they were 11. I'd love to hear more, but I just don't have time with how often I'm being asked 'How do I exit git diff???'. Every time someone tries to rebase to resolve their conflicts with master, I get asked 'why does it say my file is their file', or 'WHY DO I ONLY SEE PART OF MY CHANGE', and yet most developers can handle merging master into their branch without incident. Maybe that shouldn't be the case, but it is. When you have junior devs and interns, busy people, and people who didn't find out what source control is until they had already been a programmer for 35 years on your team, it takes a lot of work to keep the history pristine.

The benefits are often overstated. We've all been on that one project where you do git log --graph --pretty and suddenly your terminal has been taken over by rainbow spaghetti. But history is not hard to read because it's non-linear...It's hard to read because it's sloppy. A sloppy linear history where every commit message is "." is not going to be easier to read than a relatively clean non-linear history with thoughtful commit messages. Having a non-linear history does not mean you have to have branches being merged back and forth with each other several times before reaching master. It does not mean that your branches have to live for 6 months. An occasional branch on your history graph is not the end of the world.

I also don't think doing a git bisect is that much more difficult with non-linear history. A skilled developer should be able to think of plenty of ways to get the job done. Here's one article I like with a decent example of one way to do it. https://blog.smart.ly/2015/02/03/git-bisect-debugging-with-feature-branches/

tldr; I'm not saying rebases and linear history aren't great. I'm just saying you need to understand what you're signing up for and make an informed decision about whether or not it's right for your team. A perfectly linear history is not a necessity, and it certainly isn't free. It can definitely make life great in the right circumstances, but it will not make sense for everyone.

Dogs
  • 1,967
  • 15
  • 15
  • 3
    I've always thought the same but never been able to articulate so well like you did. Thank you. Will share this with whoever tell me `git rebase` like it's the best thing to do – ericn May 31 '19 at 05:35
  • 3
    To add to your answer, if you have enabled a checkpoint such as no merge into the base branch without a minimum set number of approvals, each time you have new changes on master and you rebase your topic branch on `master`, your VCS will forget about the approval, since the history has been rewritten. Eventually, if there is a delay between `Approve` and `Merge` the developer will have to bug the approver once again, unnecessarily. – Anuj Kumar Jul 01 '19 at 06:15
  • 2
    This answer is long and includes anecdotal/straw-man arguments. Not saying it's wrong, but it would be nice if we could slim it down and present the actual problems solved. Like, fx. merge-commits can make it difficult to split a branch into 2, if you find out you only want to release half of the work. – Vargr Mar 03 '20 at 13:16
19

A linear Git history (preferably consisting of logical steps), has many advantages. Apart from the two things already mentioned, there is also value in:

  1. Documentation for the posterity. A linear history is typically easier to follow. This is similar to how you want your code to be well structured and documented: whenever someone needs to deal with it later (code or history) it is very valuable to be able to quickly understand what is going on.
  2. Improving code review efficiency and effectiveness. If a topic branch is divided into linear, logical steps, it is much easier to review it compared to reviewing a convoluted history or a squashed change-monolith (which can be overwhelming).
  3. When you need to modify the history at a later time. For instance when reverting or cherry-picking a feature in whole or in part.
  4. Scalability. Unless you strive to keep your history linear when your team grows larger (e.g. hundreds of contributors), your history can become very bloated with cross branch merges, and it can be hard for all the contributors to keep track of what is going on.

In general, I think that the less linear your history is, the less valuable it is.

m-bitsnbites
  • 744
  • 4
  • 18
  • 10
    Except that in many practical applications, a linear code history is almost impossible to manage. As soon as many different people work together and you need to have more than just one or two branches things get more complicated and confusing if you try to force a linear history. IMHO, git is especially strong because of its branching/merging features. – mattmilten Oct 10 '16 at 09:07
  • 2
    I think that the advantages mentioned above still hold. Whether they outweigh the disadvantages is another question. A linear history often means (slightly) more work, but on the other hand, so do unit tests, linting, code comments and documentation. It's really a question of what quality requirements you need for your project. – m-bitsnbites Nov 02 '16 at 07:19
  • Linear history does not detract from the merits of simply creating branches. Workers branches before merging with the main branch can and should be changed, and this can be done simply due to the potential of Git. – ilyar Nov 29 '18 at 10:35
  • Besides the current moment, the main providers of repositories have the tools to simply maintain a linear or semi-linear history. Bitbucket - When to do a fast-forward merge GitHub - Allow rebase merging GitLab - Merge commit with semi-linear history - Fast-forward merge – ilyar Nov 29 '18 at 10:35
  • 2
    I agree with the listed advantages. I’ll especially note 5 point, it is very easy to use **git revert** and **git cherry-pick** in a linear history. – ilyar Nov 29 '18 at 10:49
  • At least 4,5 and maybe 3 are true of non-linear history. – D. Ben Knoble Aug 13 '19 at 12:36
  • When you have a team of 10+ teams of 2-4 devs each working on the same project (feature-based/trunk-develop style), none of these things are a benefit. Linear histories are (in my limited experience) a waste of time. You rarely/never need to go back and use the "benefits" of it. "Extraneous commits" -> people are unaware you can rename "merge" commit's message to something more useful, right? – Martin Marconcini May 18 '21 at 10:25
2

Here is a pro/con list of rebase or squashing commits when merging.

Pro:

  • Easy to read chronological history
  • No extraneous commits in history
  • Splitting branches/PR's into smaller chunks for easier review, and testing is easier without merge commits
  • Being able to rebase feature branches easily (Fx. to change merge order)
  • It encourages meaningful commit messages (but does not enforce it)
  • When doing a release, one can more easily look at the history and use it to write a change-log, for use by testers, consumers etc.
  • The commit by commit rebase encourages PR's with less/smaller commits

Con:

  • It's more complicated, less people have experience with it
  • Can lead to problems if used on remote branches without care. Examples, lost commits, inconsistent history across local environments
  • It's an extra thing to maintain (Does x project really need this?)
  • More time consuming if you rebase a branch with many conflicting commits

Personal note

I mentioned squashing before merge as well because it has a similar effect in terms of history.

Also, most github and other services have options to rebase-merge and squash-merge, right there in the GUI.

Making it practically a free action for PR's in some cases.

Vargr
  • 1,418
  • 1
  • 10
  • 10
0

With linear history, you can easily track history for a single file across renames with git log --follow. Quoting the documentation on the log.follow config option:

If true, git log will act as if the --follow option was used when a single <path> is given. This has the same limitations as --follow, i.e. it cannot be used to follow multiple files and does not work well on non-linear history.

Eugene Yarmash
  • 119,667
  • 33
  • 277
  • 336
  • Since renamed files is a pretty interface thing and not really core to git (that is, git doesnt track the underlying file in any way, really), relying on linear history for rename-tracking is a waste of the power of git. But it is one advantage, I suppose. – D. Ben Knoble Aug 13 '19 at 12:24