
I have the following bash function, which searches a repository for all files whose names match a regular expression. It currently finds every commit in which such a file exists. How can it be changed so that it only considers files that were edited (created, altered, or deleted) in each commit?

This was my original intention for the function, and I was surprised to see results much broader than expected. The reason I'm trying to do this: I created a file a long time ago and, at some point since then, accidentally deleted an important section from it. I want a list of all the points (commits) at which this file changed, so I can quickly go back to the version containing the missing section and paste it back into the current-commit version.

:<<COMMENT
    Searches all commits in the current git repository containing a file whose name matches a regular expression.

    Usage: gitf <regex>

    Parameter is required, and must be at least one non-whitespace character.

    The original version of this function was based on the GitHub gist
    - https://gist.github.com/anonymous/62d981890eccb48a99dc
    written by Stack Overflow user Handyman5
    - https://stackoverflow.com/users/459089/handyman5
    which is based on this SO question:
    - https://stackoverflow.com/questions/372506/how-can-i-search-git-branches-for-a-file-or-directory/372654#372654

    The main section of this function was authored by Stack Overflow user
    SwankSwashbucklers.
    - https://stackoverflow.com/users/2615252/swankswashbucklers
    - https://stackoverflow.com/a/28095750/2736496

    Short description: Stored in GITF_DESC
COMMENT
#GITF_DESC: For "aliaf" command (with an 'f'). Must end with a newline.
GITF_DESC="gitf [searchterm]: Searches the current git repository for the file name that matches a regular expression.\n"

Body:

gitf()  {
    #Exit if no parameter is provided (if it's the empty string)
        param=$(echo "$1" | trim)
        echo "$param"
        if [ -z "$param" ]  #http://tldp.org/LDP/abs/html/comparison-ops.html
        then
          echo "Required parameter missing. Cancelled"; return
        fi

    wasFound="0";
    LOC=refs/remotes/origin # to search local branches only: 'refs/heads'
    ref="%(refname)"
    for branch in `git for-each-ref --format="$ref" $LOC`; do
        for commit in `git rev-list $branch | grep -oP ^.\{7\}`; do
            found=$(git ls-tree -r --name-only $commit | grep "$param")
            if [ $? -eq 0 ]; then
                echo "${branch#$LOC/}: $commit:"
                while read line; do
                    echo "  $line"
                done < <(echo "$found")
                wasFound="1";
            fi
        done
    done

    if [ "$wasFound" -eq "0" ]; then
        echo "No files in this repository match '$param'."
    fi
}
aliteralmind

4 Answers


Use git diff-tree -r --name-only --no-commit-id (perhaps with --stdin) instead of git ls-tree -r --name-only. Use -m or -c if you are interested in merges, and -M or -C if you want rename or copy detection, respectively.

Or, better, parse the output of git diff-tree -r.

N.B. The code given in the question is seriously suboptimal (among other things, it checks the same commits multiple times).
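
As a rough illustration of the diff-tree suggestion above, here is a minimal sketch that keeps regex matching. The function name `gitf_changed` is hypothetical and not part of the original question. It assumes that walking `git rev-list --all` once is acceptable (each reachable commit is visited a single time) and adds `--root` so that the initial commit lists its files as added. Merge commits produce no output here unless `-m` or `-c` is added.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative): list every commit, across all
# refs, that changed a path matching an extended regular expression.
gitf_changed() {
    local regex=$1 commit found
    # rev-list --all emits each reachable commit exactly once.
    git rev-list --all | while read -r commit; do
        # diff-tree prints only the paths this commit changed relative to its
        # parent; --root treats the first commit as adding all of its files.
        found=$(git diff-tree -r --name-only --no-commit-id --root "$commit" |
                grep -E -- "$regex") || continue
        printf '%s:\n' "$(git rev-parse --short "$commit")"
        printf '%s\n' "$found" | sed 's/^/  /'
    done
}
```

Calling `gitf_changed '05_.*\.html$'` from inside a repository would print one block per commit that touched a matching file.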

Jakub Narębski

If you can live with a shell glob pattern rather than a full-blown regex, consider

git log -p --diff-filter=AMD --branches --tags -- "foo*bar.sh"

With -p, you see the deltas along with the commit message, author, SHA1, etc. The --diff-filter=AMD option selects only those commits in which the files in question were Added, Modified, or Deleted. To search remotes as well as local branches and tags, use --all rather than --branches --tags. Finally, note the -- that introduces path patterns; quote them so that the shell passes the pattern through unexpanded and git performs the glob matching itself.
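
For a more compact view than -p, the same options can be combined with `--name-status`, which prefixes each path with A, M, or D. The wrapper below is only a sketch: the function name `gitlog_changed` and the `%h %s` format string are illustrative choices, not something from the answer itself.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (name is illustrative) around the git log command above.
gitlog_changed() {
    # --all           : local branches, remote-tracking branches, and tags
    # --diff-filter   : keep only Added, Modified, or Deleted paths
    # --name-status   : print "A<TAB>path", "M<TAB>path", or "D<TAB>path"
    # "$1" is quoted so git, not the shell, interprets the glob.
    git log --all --diff-filter=AMD --name-status --format='%h %s' -- "$1"
}
```

For example, `gitlog_changed "foo*bar.sh"` lists each matching commit followed by the status and path of the files it touched.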

Greg Bacon
  • Note that with modern Git you can use extended glob patterns, with `**` matching any number of path components (i.e. it also matches '/', which is not matched by `*`). – Jakub Narębski Jan 23 '15 at 22:59
  • This prints out the body of each file. Based on the accepted answer in [this question](http://stackoverflow.com/questions/14207414/how-to-show-changed-file-name-only-with-git-log), the `--name-only` and `--oneline` options dramatically improve it: `git log -p --name-only --oneline --diff-filter=AMD --branches --tags -- "*05_*"`. – aliteralmind Jan 24 '15 at 21:21
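
To illustrate the `**` behaviour mentioned in the first comment above: Git's `:(glob)` pathspec magic makes `*` stop at directory separators, while a leading `**/` matches in all directories. The sketch below assumes a Git version that supports pathspec magic. The function name `gitlog_glob` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (name is illustrative): search changed files using
# Git's ":(glob)" pathspec magic instead of the default matching rules.
gitlog_glob() {
    # With :(glob), "*" does not cross "/", and a leading "**/" matches in
    # every directory, including the repository root.
    git log --all --diff-filter=AMD --name-only --format='%h %s' -- ":(glob)$1"
}
```

Under these rules, `gitlog_glob '05_*'` should only consider files in the top-level directory, whereas `gitlog_glob '**/05_*'` considers files at any depth.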

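You could walk the commits and use git diff to see what changed between each consecutive pair. Something like this (with $ref, $LOC, and $param defined as in the question's function):

for branch in $(git for-each-ref --format="$ref" "$LOC");
do
    # git rev-list lists the newest commits first, so $previous_commit is the
    # newer of the two commits being compared.
    previous_commit=""
    for commit in $(git rev-list "$branch" | grep -oP '^.{7}');
    do
        if [ -n "$previous_commit" ];
        then
            found=$(git diff --name-only "$commit" "$previous_commit" | grep "$param")
            if [ $? -eq 0 ];
            then
                # Report the newer commit: on a linear history it is the one
                # that introduced the change.
                echo "${branch#$LOC/}: $previous_commit:"
                while read -r line;
                do
                    echo "  $line"
                done < <(echo "$found")
                echo
                wasFound="1";
            fi
        fi
        previous_commit="$commit"
    done
done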
neatnick

I came up with this function, which is based on Greg Bacon's answer. I originally wanted regex, but globs fit the bill nicely. I also expected that a looping function would be required, but the single git log line is all that's needed.

First, a utility function:

#https://stackoverflow.com/questions/369758/how-to-trim-whitespace-from-bash-variable#comment21953456_3232433
alias trim="sed -e 's/^[[:space:]]*//g' -e 's/[[:space:]]*\$//g'"
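
One caveat with the alias: bash only expands aliases in interactive shells unless `shopt -s expand_aliases` is set, so `trim` may not be found if the function is ever run from a script. A function with the same name avoids that. The following is a sketch with the same sed expressions, minus the redundant `g` flags:

```shell
#!/usr/bin/env bash
# Function alternative to the alias above: strips leading and trailing
# whitespace from each line read on standard input.
trim() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
```

It is used the same way as the alias, for example `param=$(echo "$1" | trim)`.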

Documentation header:

:<<COMMENT
   Searches all commits in the current git repository containing a file
   that has *changed*, whose name matches a glob. If the glob does not
   contain any asterisks, then it is surrounded by them on both sides.


   Usage:
      gitf "05"     #Equivalent to "*05*"
      gitf "05_*"

   Parameter is required, and must be at least one non-whitespace character.

   See:
   - https://stackoverflow.com/questions/28119379/bash-function-to-find-all-git-commits-in-which-a-file-whose-name-matches-a-rege/28120305
   - https://stackoverflow.com/questions/28094136/bash-function-to-search-git-repository-for-a-filename-that-matches-regex/28095750
   - https://stackoverflow.com/questions/372506/how-can-i-search-git-branches-for-a-file-or-directory/372654#372654

   The main "git log" line is based on this answer
   - https://stackoverflow.com/a/28119940/2736496
   by Stack Overflow user Greg Bacon
   - https://stackoverflow.com/users/123109/greg-bacon

   With thanks to SwankSwashbucklers
   - https://stackoverflow.com/users/2615252/swankswashbucklers

   Short description: Stored in GITF_DESC
COMMENT
#GITF_DESC: For "aliaf" command (with an 'f'). Must end with a newline.
GITF_DESC="gitf [glob]: Searches all commits in the current git repository containing a file that has *changed*, whose name matches a glob.\n"

Body:

gitf()  {
   #Exit if no parameter is provided (if it's the empty string)
      param=$(echo "$1" | trim)
      echo "$param"
      if [ -z "$param" ]  #http://tldp.org/LDP/abs/html/comparison-ops.html
      then
        echo "Required parameter missing. Cancelled"; return
      fi

   #https://stackoverflow.com/questions/229551/string-contains-in-bash/229606#229606
   if [[ $param != *"*"* ]]
   then
     param="*$param*"
   fi

   echo "Searching for \"$param\"..."

   git log -p --name-only --oneline --diff-filter=AMD --branches --tags -- "$param"
}

Example output:

$ gitf 05_
05_
Searching for "*05_*"...
14e5cdd Quick save (no message): 01-21-2015__14_36_11
non_django_files/wordpress_posts/templates/05_login_remember_me.html
2efdeb1 Part four final. Changed auth/tests in post to auth/tests_login_basic.
non_django_files/wordpress_posts/templates/05_login_remember_me.html
526ca01 Part four final. Renamed auth/tests to test_basic_login, so Java doesn't need to parse the py file in future par
non_django_files/wordpress_posts/templates/05_login_remember_me.html
7c227f3 Escaped unescaped dollar-signs in initial_script_sh snippet, and added delete-all-but-.git command in comment at
non_django_files/wordpress_posts/templates/05_login_remember_me.html
e68a30a Part four final, moved post output folder into wordpress_posts.
non_django_files/wordpress_posts/templates/05_login_remember_me.html
3c5e4ec Part two final. Corrections/minor changes to all posts.
non_django_files/wordpress_posts/templates/05_login_remember_me.html
3a7dac9 Finished part one.
non_django_files/wordpress_posts/templates/05_login_remember_me.html
f87540e Initial commit
non_django_files/wordpress_posts/templates/05_login_remember_me.html
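
Once a promising commit shows up in a listing like the one above, the old version of the file can be printed with `git show <commit>:<path>`, which is how the missing section could be recovered. The helper name `git_file_at` below is hypothetical. The path is taken relative to the repository root.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is illustrative): print a file's contents as they
# were in the given commit. The path is relative to the repository root.
git_file_at() {
    git show "$1:$2"
}
```

For instance, `git_file_at 3c5e4ec non_django_files/wordpress_posts/templates/05_login_remember_me.html` would print that file as of commit 3c5e4ec, ready to be piped to a pager or redirected to a scratch file for comparison.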
aliteralmind