Some lines have been deleted from a file, and I'd like to know when and how it happened.

But I can't find any faulty commit, which is really strange.

Here is the git log output; the most recent change it shows is the commit in which the lines were added:

$ git log --follow -p -- src/QsGeneralBundle/Repository/EmployeeRepository.php 
commit 80dc439aba2bf38ed7e95888dd21c1ba0f8960ce
Author: Benoit <benoit.guchet@gmail.Com>
Date:   Tue Jul 16 18:31:38 2019 +0200

    Opti Doctrine #814

diff --git a/src/QsGeneralBundle/Repository/EmployeeRepository.php b/src/QsGeneralBundle/Repository/EmployeeRepository.php
index 7dfc2f963..a3e916690 100644
--- a/src/QsGeneralBundle/Repository/EmployeeRepository.php
+++ b/src/QsGeneralBundle/Repository/EmployeeRepository.php
@@ -10,12 +10,15 @@ namespace QsGeneralBundle\Repository;
  */
 class EmployeeRepository extends BaseRepository
 {
-       public function findAllQb() {
-               $qb = $this->createQueryBuilder('e')
+    public function findAllQb() {
+        $qb = $this->createQueryBuilder('e')
             ->innerJoin('e.Account', 'a')
-                       ->orderBy('a.firstname', 'ASC')
-                       ->addOrderBy('a.lastname', 'ASC')
-            ->andWhere('a.isEnabled = TRUE');
+            ->leftJoin('a.SpecificRole', 'r')
+            ->leftJoin('a.Contractor', 'c')
+            ->orderBy('a.firstname', 'ASC')
+            ->addOrderBy('a.lastname', 'ASC')
+            ->andWhere('a.isEnabled = TRUE')
+            ->select('e, a, r, c');

                return $qb;
        }

Now here is the current content of the file:

<?php

namespace QsGeneralBundle\Repository;

/**
 * Created by IntelliJ IDEA.
 * User: Benoît Guchet
 * Date: 16/11/2016
 * Time: 13:26
 */
class EmployeeRepository extends BaseRepository
{
    public function findAllQb() {
        $qb = $this->createQueryBuilder('e')
            ->innerJoin('e.Account', 'a')
            ->orderBy('a.firstname', 'ASC')
            ->addOrderBy('a.lastname', 'ASC')
            ->andWhere('a.isEnabled = TRUE');

        return $qb;
    }
}

The file has no local changes, as shown by git status:

$ git status src/QsGeneralBundle/Repository/EmployeeRepository.php 
On branch master
Your branch is ahead of 'origin/master' by 11 commits.
  (use "git push" to publish your local commits)
nothing to commit, working tree clean

Any insight into this mystery?

EylM
theredled

1 Answer

Assuming src/QsGeneralBundle/Repository/EmployeeRepository.php has never been renamed (or not in the range of commits that will be displayed), you can omit the --follow and add, instead, --full-history and -m.

There are two problems here:

  • First, the change you are looking for probably occurred during a merge. Without -m (or another non-default option such as -c or --cc), git log -p shows no diff for a merge commit, so any change made in the merge itself stays hidden.

  • Second, when you specify a file name such as src/QsGeneralBundle/Repository/EmployeeRepository.php (with or without --follow), Git turns on History Simplification. When History Simplification is on, Git simply doesn't bother to look at some commits. In particular, at a merge it won't follow the leg whose change the merge (incorrectly, but see the footnote) didn't take, even though you think the merge should have taken it.

Hence, by forcing History Simplification to keep all commits (--full-history) and forcing git log to show each merge commit as two diffs, one against each parent (-m), you should be able to find where the change you want was (incorrectly) dropped.


Footnote: Git assumes that the result of each merge is the correct result. That is, suppose a merge commit has, as its right-side incoming-branch changes, a bug fix, and as its left-side changes, no changes at all. Suppose further that whoever made the merge dropped the bug fix:

o another commit (has bug)
|
* merge commit (has bug, person who merged dropped the fix)
|\
o | main line work (has bug)
| |
| o fix for bug
|/
o main line work (has bug)

Git assumes that the right-side changes—the bug fix—were instead the introduction of a bug, which the person who made the merge was smart and omitted, keeping the left-side version that (according to the person doing the merge, and hence now according to Git too) was "correct". In fact, that merge commit was wrong ... but when Git does History Simplification, Git assumes that it was right. So Git follows only the left side of the commit graph at this point. You never see the bug fix!

The only way to see the commit with the fix is to avoid or disable History Simplification (leave out the file name, or add --full-history). But since by default, git log -p won't diff the merge commit against the bug-fix commit, even that's not sufficient.
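The scenario in this footnote can be reproduced in a throwaway repository. The following is a minimal sketch, not a reconstruction of what happened in the asker's repository: the file name file.txt, the branch name fix, the demo identity, and the use of git merge -s ours to drop the fix are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: build a tiny repository in which a merge silently drops a fix.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"               # hypothetical identity for the demo
git config user.email "demo@example.com"

# Main line: file.txt contains a bug.
printf 'line 1\nbug\n' > file.txt
git add file.txt
git commit -q -m "main line work (has bug)"
main=$(git symbolic-ref --short HEAD)     # whatever the default branch is called

# Side branch: fix the bug.
git checkout -q -b fix
printf 'line 1\nfixed\n' > file.txt
git commit -q -a -m "fix for bug"

# Main line again: unrelated work, file.txt still has the bug.
git checkout -q "$main"
echo unrelated > other.txt
git add other.txt
git commit -q -m "more main line work (has bug)"

# A merge that keeps only "our" side, so the fix is dropped.
git merge -q -s ours -m "merge commit (drops the fix)" fix

# 1. Default: History Simplification follows only the parent whose
#    file.txt matches the merge result.
plain=$(git log --format=%s -- file.txt)
# 2. --full-history: both legs of the merge are walked.
full=$(git log --full-history --format=%s -- file.txt)
# 3. --full-history -m -p: the merge is also diffed against each parent.
full_m=$(git log --full-history -m -p --format=%s -- file.txt)

printf '== default ==\n%s\n' "$plain"
printf '== --full-history ==\n%s\n' "$full"
printf '== --full-history -m -p ==\n%s\n' "$full_m"
```

If this behaves as described above, the first listing omits "fix for bug", the second includes it, and the third also shows the merge's diff against its second parent, which is where the fix gets reverted. For the file in the question, the corresponding command would be git log --full-history -m -p -- src/QsGeneralBundle/Repository/EmployeeRepository.php.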


Additional footnote: note that file names, in Git, are the entire path. If a file was named a/b/c.ext and is now named a/d/c.ext, this one file has two different names. Git does not store directories / folders, which is why you cannot store an empty directory. (Side note to the footnote: there is an empty tree object in every repository, but using it leads to problems later as Git keeps trying to convert this into a submodule—see the linked question for details.)

The way Git deals with your computer's insistence on using directories / folders—which is what allows your OS to provide empty ones—is to coddle your OS. Git just extracts the files for a commit, and if your OS demands that a file whose Git-name is a/b/c.ext be created as file c.ext in folder b in folder a, well, then, Git creates folder a and then folder a/b so as to create a file whose name is a/b/c.ext. If there are no files whose name starts with a/b, Git just doesn't create a/b in the first place.
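This is easy to check in a scratch repository. The sketch below uses hypothetical names (the a/b/c.ext path from the example above, plus a directory called empty) and only illustrates two things: the index lists full path names, and an empty directory does not survive a clone.

```shell
#!/bin/sh
# Sketch: Git tracks full path names, not directories.
set -e

repo=$(mktemp -d)
scratch=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"               # hypothetical identity for the demo
git config user.email "demo@example.com"

# One real file in a nested folder, plus one empty folder.
mkdir -p a/b empty
echo hello > a/b/c.ext
git add .
git commit -q -m "add a/b/c.ext"

# The index holds one entry whose name is the whole path.
tracked=$(git ls-files)
echo "tracked: $tracked"

# A fresh clone recreates a/ and a/b only because a file needs them.
git clone -q "$repo" "$scratch/copy"
if [ -d "$scratch/copy/empty" ]; then empty_in_clone=yes; else empty_in_clone=no; fi
echo "empty directory present in clone: $empty_in_clone"
```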

In any case, the fundamental problem with --follow is that its implementation is a cheesy hack. Suppose that history—i.e., the backwards-looking chain of commits in the repository—looks something like this, with later commits towards the right:

...--I--J
         \
          M--N   <-- branch
         /
...--K--L

where each letter stands in for an actual commit hash. In commit N you have a file named a/d/c.ext; in commit I, that same file has name a/b/c.ext. The way that git log --follow a/d/c.ext handles this is to look at each commit-pair, one step at a time:

  • First, Git looks at the M-N pair.

    Is the file named a/b/c.ext in M, and a/d/c.ext in N? If so, then as soon as Git works on the J-M and L-M pairs, Git will stop looking for a/d/c.ext and start, instead, looking for a/b/c.ext (the old name, not the current name).

    If not—if it's still a/d/c.ext in M—then when Git compares J and M, or compares L and M, it will look for a/d/c.ext (the current name, not the old name).

  • Next, Git looks at the J-M and L-M pairs. If History Simplification is on and --full-history does not override it, Git instead picks one of the two pairs, doesn't really look at the other pair, and doesn't follow that leg of the merge.

    If the file is renamed in one of these pairs, but not the other, Git really can't handle this. Without --full-history Git will pick one leg to follow and, to some extent, pretend there isn't a merge and the rename detection can work (although I'm not sure it does, given the other oddities that git log displays around merges). If the file isn't renamed in either pair, though, we're still good: the file still has its new name, so if we force --full-history, so that Git must now traverse both legs of the merge, Git will do that. Let's say it now does the I-J part:

  • Now Git compares commits I and J. Was the file renamed here? Let's say it was renamed here. In effect, Git says to itself: Aha, let's stop looking for a/d/c.ext; from now on, let's look for a/b/c.ext. When Git compares commit I with whatever comes before commit I, that works.

  • But, having followed that leg of the merge for a while, Git now—under --full-history anyway—goes back and compares commits K and L, from the other leg of the merge. Git goes to check on file a/b/c.ext, as that's the path it now believes the file has. But the file still has the a/d/c.ext name in this leg of the merge. The result is that git log --follow is looking for the wrong file!

To make this all work, git log --follow would need to be way more clever than it really is. All --follow does is change the one file name it is looking for, and it does not remember that it did so in some particular part of the commit graph. Git needs to somehow tie the name change to the graph traversal, and "undo" the change whenever it goes back "up" the graph (but Git doesn't really go up: it really just puts commits into a priority queue, which means there is no place to put this name-change information).

If all the name changes happen in the "right" places, --follow and --full-history can mix well. For instance, if the name-change to a/d/c.ext from the old name a/b/c.ext happens in the M-N pair, so that all the older commits have the old name—then --follow and --full-history have no problem following both legs of the merge: the file has only the one name from M on back in history and the --follow changes the one name at the right time. It's when the name-change happens in parallel histories, in multiple different merge legs, that this screws up.
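The simplest "right place" for a rename is a linear history with no merges at all. The sketch below covers only that case, reusing the hypothetical paths a/b/c.ext and a/d/c.ext from above; it does not attempt to reproduce the parallel-rename failure.

```shell
#!/bin/sh
# Sketch: a rename in a linear history, with and without --follow.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"               # hypothetical identity for the demo
git config user.email "demo@example.com"

# Two commits while the file is still named a/b/c.ext.
mkdir -p a/b
printf 'one\ntwo\nthree\n' > a/b/c.ext
git add a/b/c.ext
git commit -q -m "create under old name"
echo four >> a/b/c.ext
git commit -q -a -m "edit under old name"

# Pure rename: to Git this is a different file name, a/d/c.ext.
mkdir a/d
git mv a/b/c.ext a/d/c.ext
git commit -q -m "rename a/b/c.ext to a/d/c.ext"

# One more commit under the new name.
echo five >> a/d/c.ext
git commit -q -a -m "edit under new name"

# Without --follow, Git only ever looks for the path a/d/c.ext.
plain=$(git log --format=%s -- a/d/c.ext)
# With --follow, Git switches to a/b/c.ext once it sees the rename.
follow=$(git log --follow --format=%s -- a/d/c.ext)

printf '== without --follow ==\n%s\n' "$plain"
printf '== with --follow ==\n%s\n' "$follow"
```

Here the listing without --follow should stop at the rename commit, while the one with --follow should continue back to the commits made under the old name.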

torek
  • Strangely enough, it was never renamed (only a parent subdirectory was, years ago), but `git log` doesn't even include that last-change commit while `git log --follow` does... – theredled Jul 20 '19 at 10:50
  • Anyway thanks, adding `--full-history -m` indeed shows me that a merge is to blame. Git did it automatically :/ can I trust git auto merges? Or must I always recheck everything? – theredled Jul 20 '19 at 10:54
  • It did the same for a lot of files – theredled Jul 20 '19 at 11:23
  • Git doesn't store directories, Git only stores files. If the file's name was `a/b/c` once and is now `a/d/c`, the file has been renamed: the file's name is the full string. That's not a directory, it's just part of the file name! – torek Jul 20 '19 at 16:54
  • As for what happened in the merge, well, you can *repeat* the merge to see if it really was Git that did this; and/or you can look at the two diffs from the three commits, to see what Git saw that might make Git do this. My bet is that Git declared a conflict during merge and someone used either `-X` or `-s` to "fix" the conflict by ignoring one side of the merge (or did the equivalent in a merge tool). – torek Jul 20 '19 at 16:55