25

Frequently, my colleagues will make some changes to an open pull request, rebase their local branch against the base branch - often squashing their changes into previous commits as well - and force-push.

How can I see what changed between the old version of the PR and the new version of the PR?

I guess I could do a git pull and git checkout $BRANCH_NAME when the PR was first raised, then git fetch and then git diff $BRANCH_NAME..origin/$BRANCH_NAME after the PR was updated - but that will also show changes that have been introduced into the base branch (typically master) and brought into the PR via a rebase. Is it possible to remove that noise and just show what has changed in the PR itself?

Robin Green
  • Actually, if we all used TopGit, and pushed our TopGit branches as well, we wouldn't have this problem, because we could just look at each other's TopGit branches to see what had changed - and that would also solve the rebase problem, because TopGit does merges instead of rebases, and then exports (which is like a squashing rebase that doesn't change history) when publishing a PR. That would require my colleagues to change their git workflow, though. – Robin Green Sep 25 '16 at 08:18
  • Although, on second thoughts, that wouldn't always solve the problem because any controversial merge resolutions could get hidden inside merge commits. – Robin Green Sep 26 '16 at 08:15
  • Side note: GitHub supports the "rebase on merge" merge strategy now, so you don't actually need to rebase pull requests any more – scrowler Nov 03 '16 at 22:22
  • Rebasing on merge can break the build. So can non-trivial merges, of course. I recommend rebasing and rebuilding just before merging. – Robin Green Nov 03 '16 at 22:37
  • So do I to be fair. I'm not sure that what you're asking is possible. Rebasing rewrites Git history, so unless you have the branch locally then you won't be able to compare it against the rebased version – scrowler Nov 03 '16 at 22:38

5 Answers

14

the old version of the PR

You can do so directly on GitHub: see "Find committer of a force push on GitHub"

Clicking the “force-pushed” link will show a two-dot comparison between the two commits.


Original answer: 2016

That would only be available in the reflog of the remote repo, which would include the previous HEAD of the force-pushed branch.
Since the remote repo is a GitHub one, you can still infer the old commit by looking at push events: see "Does github remember commit IDs?".

that will also show changes that have been introduced into the base branch (typically master)

More precisely, you will always get the differences against a common ancestor (which will include commits from the base branch, like master).

See What are the differences between double-dot ".." and triple-dot "..." in Git diff commit ranges?

http://mythic-beasts.com/~mark/git-diff-help.png

So in your case, your forced-pushed branch looks like this on the remote repo:

      x--x--x        (old branch in reflog)
     /
 m--M0--M--M   (master)
            \
             X--X--X (new branch forced push)

A diff old_HEAD..new_HEAD would include the few M commits from the base branch, as they are part of the path from the common ancestor (M0).

So you can compare a force-pushed branch (provided you are monitoring push events and know the previous HEAD of that branch).
But you cannot easily compare the two branches without also seeing their common ancestor path.
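
The noise described above can be reproduced in a throwaway repository. This is only a sketch: the file names (base.txt, feature.txt, master-only.txt), the branch name feature and the variables old_head, new_head and noisy are made up for illustration, and the "force push" is simulated locally by remembering the old tip before rebasing.

```shell
set -eu
# Throwaway repository (hypothetical layout mirroring the diagram above)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the base branch is called master
git config user.email demo@example.com
git config user.name Demo

# M0: the commit the old PR branched from
echo base > base.txt
git add base.txt
git commit -q -m "M0"

# Old version of the PR: one commit on top of M0
git checkout -q -b feature
echo "feature v1" > feature.txt
git add feature.txt
git commit -q -m "x: feature v1"
old_head=$(git rev-parse HEAD)   # what the remote reflog or a push event would give you

# master moves on (the M commits)
git checkout -q master
echo unrelated > master-only.txt
git add master-only.txt
git commit -q -m "M: unrelated master change"

# New version of the PR: rebased onto master and amended
git checkout -q feature
git rebase -q master
echo "feature v2" > feature.txt
git commit -q -a --amend -m "X: feature v2"
new_head=$(git rev-parse HEAD)

# Comparing the two tips directly also reports the file that only master touched
noisy=$(git diff --name-only "$old_head" "$new_head")
echo "$noisy"
```

Here the PR itself only changed feature.txt between the two versions, yet master-only.txt shows up in the diff as well, which is the noise the question asks about.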

VonC
9

Check out this answer to another question, which wants to do something very similar to what you are trying to achieve.

It describes your situation like this:

newcommit -> the new pull request commit
oldcommit -> the old pull request commit
upstream -> the base branch commit the new pull request is based on

Now do this:

git commit-tree newcommit^{tree} -p oldcommit -p upstream -m "message"
git show <commit id returned by previous command>

The idea is that commit-tree fakes a merge between oldcommit and upstream that produces the tree of newcommit, and thus contains exactly the code of newcommit. It does not modify your current branch at all: it creates a new headless commit and gives you its ID. This means git show will list every modification as a conflict resolution, which is exactly the difference between the new PR and the old one.

To be able to do that, you need to have the previous version of the PR somewhere in your git repository (if a force push has been performed, the history has been rewritten and the old version can't be recovered unless you have it on your machine or you have access to the server reflog). See VonC's answer for details about this.

Assuming:

  • base branch: master
  • you have locally the old PR branch: $BRANCH_NAME
  • the new PR in a remote branch: origin/$BRANCH_NAME

You can do it like this:

# fetch upstream changes locally (updates origin/$BRANCH_NAME)
git fetch
# produce the fake merge commit and keep its ID
fake_merge=$(git commit-tree "origin/$BRANCH_NAME^{tree}" \
         -p "$BRANCH_NAME" \
         -p "$(git merge-base master "origin/$BRANCH_NAME")" \
         -m "message")
# see the "fake" conflict resolution = difference between the two PR versions
git show "$fake_merge"

git merge-base is used to find the common ancestor of two branches, in this case the commit in the base branch on which the new PR is based. If you prefer, you can write the commit ID directly.
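
To see the fake-merge trick end to end, here is a self-contained sketch that runs in a throwaway repository. The file names, the branch name feature and the variables old_head, new_head, upstream, fake and changed are assumptions made up for this example, and the old PR tip is simply remembered in a variable instead of being kept in a local branch.

```shell
set -eu
# Throwaway repository (hypothetical names throughout)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the base branch is called master
git config user.email demo@example.com
git config user.name Demo

echo base > base.txt
git add base.txt
git commit -q -m "M0"

# Old version of the PR
git checkout -q -b feature
echo "feature v1" > feature.txt
git add feature.txt
git commit -q -m "feature v1"
old_head=$(git rev-parse HEAD)

# The base branch receives an unrelated change
git checkout -q master
echo unrelated > master-only.txt
git add master-only.txt
git commit -q -m "unrelated master change"

# New version of the PR: rebased onto master and amended
git checkout -q feature
git rebase -q master
echo "feature v2" > feature.txt
git commit -q -a --amend -m "feature v2"
new_head=$(git rev-parse HEAD)

# Fake merge of the old PR and the new base, carrying the new PR's tree
upstream=$(git merge-base master "$new_head")
fake=$(git commit-tree "$new_head^{tree}" -p "$old_head" -p "$upstream" -m "fake merge")

# Files that differ from both parents, i.e. the "conflict resolutions"
changed=$(git show --name-only --format= "$fake")
echo "$changed"
```

In this sketch only feature.txt should be listed: master-only.txt is identical to the upstream parent, so the combined diff of the fake merge leaves it out. Running git show "$fake" without --name-only shows the actual line changes.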

Community
Daniele Segato
  • That seems clever (and has more potential than my answer) +1 – VonC Nov 10 '16 at 11:47
  • Your answer explains the concept of the reflog, which is useful anyway :) I didn't write about it because yours was good enough in explaining it ;) – Daniele Segato Nov 10 '16 at 11:54
  • On the `checkout -b` command, it could be clarified where your HEAD is before that. Is `temp` pointing to the same place as `master`, or where? – Wildcard Nov 10 '16 at 12:15
  • Doesn't really matter. The `git commit-tree` command is a low-level command creating a new commit with its own history. Specifically a commit with 2 parents, the ones you specified (`upstream` and `oldcommit`) – Daniele Segato Nov 10 '16 at 12:21
  • Actually you were partially right, it's useless to change branch.. I'm editing the answer – Daniele Segato Nov 10 '16 at 12:37
  • Hats off to you, sir. I've been using git for 8 years and don't come close to understanding this answer, let alone finding the solution myself. – Eric Duminil Nov 10 '16 at 13:34
  • You need to understand the internal structure of git to fully master it. It's actually not that hard at all when you get it: it's a DAG of commits, and everything is a file with its hash as its name. – Daniele Segato Nov 10 '16 at 13:39
1

Since Git 2.19, git range-diff provides a nice way to compare rebased histories. It works well when the new history has the same commit structure with only minor changes (but it might not help you much if the new version of the pull request both has a new base and a different breakdown into commits). It compares the patches in each commit and shows the differences, with reasonably good highlighting to distinguish between context changes and patch changes.

Suppose that a pull request has been rebased from c1 to c2, with the target branch being main and you have a local working directory with myrepo pointing to the GitHub repository with the pull request. You can download commits and compare the histories as follows:

git fetch myrepo c1 c2
git range-diff main c1 c2

Note that you need a local copy of the old tip of the pull request. I don't know how to obtain the commit ID automatically; it's what the GitHub web interface shows in “… force-pushed the mywork branch from c1 to c2”. As I write, I think GitHub always allows fetching commits by their ID even if they were only present in an old version of a pull request (in the past this was not the case). Give the commit a branch name if you don't want it to be garbage-collected from your working copy.

If there are new commits after the part you've already reviewed, a handy way to separate the rebased part from the new commits is to search for the commit message of the last commit you reviewed. In bash/zsh syntax:

git range-diff main {c1,c2}'^{/This is the commit message}'

For more complex cases (for example, if some of the old commits have lost their relevance because a different implementation of the same sub-feature has been merged), you can specify different starting points: git range-diff OLD_START..OLD_END NEW_START..NEW_END
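
As a rough illustration of what range-diff reports, the following sketch builds a throwaway repository, rebases and amends a one-commit branch, and compares the two versions. It assumes Git 2.19 or later; the file names, the branch name feature and the variables c1, c2 and out are invented for the example.

```shell
set -eu
# Throwaway repository (hypothetical names throughout)
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # make sure the target branch is called main
git config user.email demo@example.com
git config user.name Demo

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Old version of the pull request (c1): a 30-line file
git checkout -q -b feature
seq 1 30 > feature.txt
git add feature.txt
git commit -q -m "Add feature"
c1=$(git rev-parse HEAD)

# The target branch receives an unrelated change
git checkout -q main
echo unrelated > main-only.txt
git add main-only.txt
git commit -q -m "Unrelated change on main"

# New version (c2): rebased onto main, with one line of the file changed
git checkout -q feature
git rebase -q main
seq 1 30 | sed 's/^7$/seven/' > feature.txt
git commit -q -a --amend --no-edit
c2=$(git rev-parse HEAD)

# Compare the commits in main..c1 with those in main..c2
out=$(git range-diff main "$c1" "$c2")
echo "$out"
```

The first line of the output pairs the old and new commit and marks the pair with "!" because the patch changed; the indented diff-of-diffs below it mentions only the edited line, and the unrelated change on main does not appear.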

Gilles 'SO- stop being evil'
0

I believe that it is simply not possible to obtain the old versions of pull requests, as suggested by this ticket: https://github.com/isaacs/github/issues/999 Users who post in that repository are advised to contact GitHub support, so GitHub support has likely already replied that it was not possible when that ticket was created.

This is a major missing feature from GitHub pull requests that Gerrit has and which Gerrit users miss badly.

0

The following works, but is inefficient for large repos because a full snapshot of the work tree is downloaded for each of the two revisions:

repo=rossant/awesome-math
a=61e250
b=2b53ad
tmp=$(mktemp -d -p /tmp)
mkdir -p "$tmp"/{a,b}
# download and unpack a snapshot of each revision
curl -SL "https://github.com/$repo/archive/$a.tar.gz" | tar xz --strip-components=1 -C "$tmp/a"
curl -SL "https://github.com/$repo/archive/$b.tar.gz" | tar xz --strip-components=1 -C "$tmp/b"
# compare the two snapshots as plain directories
git diff --no-index "$tmp"/{a,b}
rm -r "$tmp"

This could be much simpler and leaner if GitHub allowed fetching revisions by their SHA-1 with git fetch, but that's not the case yet.

ens