
I followed this post and I want to get all of the lines of code that were changed, excluding comments.

Sample of a comment I want to ignore:

* @copyright Copyright (c) 2006-2017 X.commerce, Inc. and affiliates (http://www.magento.com)

was changed into:

* @copyright Copyright (c) 2006-2018 Magento, Inc. (http://www.magento.com)

Using this command:

`git diff -G [^* @copyright] > diff.txt`, it outputs:
fatal: ambiguous argument '@copyright]': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

How can I get a list of the files where other parts of the code were changed, not only the copyright line? Thank you.

Attila Naghi

2 Answers


TL;DR: -G doesn't quite do what you want, but see Regular expression to match a line that doesn't contain a word? and note that -G uses POSIX "extended regular expression" syntax.

Long

You have two different problems to solve (plus some unhelpful documentation from Git). One of these problems is writing a regular expression that selects what you want. RomainValeri's answer is aimed at this (but misses, unfortunately), ignoring the other problem: -G won't quite do what you want. Let me tackle that one first.

The git diff and git log commands (which are different commands entirely) share some source code, and some documentation. Both -G and -S activate what Git calls the pickaxe, which is documented in a separate page called gitdiffcore.

When we run git log, we have a common problem: it shows us too many commits. We want to see fewer commits! In particular, we want to see commits that affected some particular part(s) of our code. This is where -G and -S are the most useful: git log -G<regex> or git log -S<text> find commits where, if we run git show on those commits, we'd see a change that affects the code we asked to find.

Remember, git show <hash> runs, more or less, git diff <hash>^ <hash> to compare the snapshot in the parent of one specific commit to the snapshot in the child. That is, each commit stores a full snapshot, so by comparing what's in the parent with what's in the child, we can find out what changed in that commit.[1] Once we know what changed, then we can search through that for some particular change.

So, back to git log for a bit: we might want to find out which commit(s) changed the copyright date, to see if they changed anything else too. So we'd have git log -S or git log -G find and list/show all the commits that have a line containing the word copyright in their diffs. That's what these options are particularly good for. But that's not what you're trying to do.
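As a rough sketch of the two pickaxe forms with git log (the function names are made up for illustration, and `@copyright` is just the search term from the question): -G selects commits whose diff has an added or removed line matching the regex, while -S selects commits that change the number of times the string occurs in a file. A commit that only edits the year on an existing `@copyright` line is therefore found by -G but not by -S.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical, not Git commands.
# Run them from inside a Git working tree.

# Commits whose diff has an added or removed line matching the regex in $1.
commits_matching_diff_lines() {
    git log --oneline -G"$1"
}

# Commits that change how many times the string in $1 occurs in some file.
commits_changing_occurrences() {
    git log --oneline -S"$1"
}
```

For example, `commits_matching_diff_lines '@copyright'` lists every commit that touched a copyright line, including ones that only bumped the year.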


[1] It's a bit like going through a chart of the high temperature each day for the last week, and finding out which day the temperature changed the most: we have to subtract day-pairs; knowing it was 24°C on Monday and 18°C on Wednesday tells us nothing about what it was on Tuesday, but if it was 19°C Tuesday, now we know it changed by 5 from Mon to Tue, and then by 1 from Tue to Wed. Likewise, with snapshots of source code, we have to subtract Monday's snapshot from Tuesday's, to see what changed on Tuesday. We have to subtract Tuesday's from Wednesday's to see what changed on Wednesday.


Now, when we run git diff on any two specific commits, like this:

git diff <old-commit> <new-commit>

Git effectively subtracts the old-commit from the new-commit to tell us what's different. But a commit is made up of files. There might be 10,000 files in each of the two commits. The diff will show us what changed in every file, and that might be 1% of the files, or 100 files—but we might be interested only in files in which the diff contains the word "copyright", for instance.

This is what the pickaxe (-G or -S) does when used with git diff: if git diff was going to show us 100 files with differences, the pickaxe selects only those files that contain the specific change we ask it to look for. So if git diff would have shown us 100 files, but only ten of the files have the word "copyright" in their diff output, git diff -G copyright <old> <new> would show us those ten files.
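The file-selection behavior just described can be sketched with --name-only, which prints file names instead of full diffs. This is a minimal illustration, not the only way to do it; the wrapper name and its argument order are assumptions.

```shell
#!/bin/sh
# Illustrative wrapper (the name is hypothetical, not a Git command).
# $1 = regex, $2 = old commit, $3 = new commit.
# Prints the names of files whose diff between the two commits has at least
# one added or removed line matching the regex. Context lines do not count.
files_with_matching_changes() {
    git diff --name-only -G"$1" "$2" "$3"
}
```

Called as `files_with_matching_changes '@copyright' <old> <new>`, this would list the ten files from the example above rather than all 100.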

In your case, once you come up with a regular expression that matches things that aren't "copyright", you have a way to tell Git: Find diffs that changed something other than text after the word copyright. If you use this with git diff, that will show you the full diff—including the copyright change!—of files that changed a line that doesn't contain the word "copyright".

That's not what you said you wanted. It may have to be good enough, though, because that's what Git can give you.
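If all that is needed is the list of files, one workaround outside Git is to post-process the diff text. The sketch below is not a Git feature: the function name is hypothetical, it assumes plain unified diff output with the default a/ and b/ path prefixes, and it assumes that any changed line mentioning `@copyright` can be ignored. It prints each file that has at least one added or removed line without `@copyright` in it.

```shell
#!/bin/sh
# Hypothetical helper: reads a unified diff on stdin and prints every file
# that has at least one added or removed line NOT containing "@copyright".
files_with_other_changes() {
    awk '
        # A new per-file section starts: header lines follow until the first hunk.
        /^diff --git / { in_hunk = 0; file = ""; next }

        # Header lines "--- a/path" and "+++ b/path"; skip /dev/null so that
        # added or deleted files keep the one real path.
        !in_hunk && /^(---|[+][+][+]) / {
            path = substr($0, 5)
            if (path != "/dev/null") { sub(/^[ab]\//, "", path); file = path }
            next
        }

        # Hunk header: everything after it is context or changed lines.
        /^@@ / { in_hunk = 1; next }

        # Changed line that is not about the copyright notice.
        in_hunk && /^[-+]/ && $0 !~ /@copyright/ && !(file in seen) {
            seen[file] = 1
            print file
        }
    '
}
```

Usage would look like `git diff <old-commit> <new-commit> | files_with_other_changes`. Files whose only changes are copyright lines produce no output, and binary or mode-only changes are skipped because they have no hunks.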

Now, let's go back to finding a regular expression that matches a changed line that does not contain the word "@copyright", perhaps after a literal asterisk and space. This is harder than it looks! I'll outsource this to an existing question and its many answers: Regular expression to match a line that doesn't contain a word? Git uses POSIX extended regular expressions, or EREs. These do not have negative lookaround, so the method in the first comment or this answer will do the job, but the accepted answer won't.

torek
  • 1) Thank you so much. 2) Yes I know we're to refrain ourselves from these non-productive comments but it was just *too* hard to just leech again silently... I'll delete this tomorrow. – RomainValeri Sep 25 '18 at 17:27

Try

git diff -G"(^\\\* @copyright)"

When the string is parsed for regexp, \\ becomes a single \, and \* becomes a *.
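One way to check that claim is to print the argument exactly as the shell hands it to Git. In the minimal sketch below (both helper names are hypothetical), a POSIX shell inside double quotes turns `\\` into `\` but leaves `\*` alone, so the regex engine receives `(^\\* @copyright)`. There `\\*` means "zero or more backslashes" rather than a literal asterisk. Single quotes pass the pattern through untouched.

```shell
#!/bin/sh
# Print an argument exactly as the shell passes it to a command.
show_arg() {
    printf '%s\n' "$1"
}

# Succeed if the extended regex in $1 matches the text in $2.
# Assumption: grep -E stands in for Git's matcher here, since both
# use POSIX extended regular expressions.
matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}
```

Here `show_arg "(^\\\* @copyright)"` prints `(^\\* @copyright)`, and that pattern does not match the line ` * @copyright 2018`. By contrast, `matches '^ \* @copyright' ' * @copyright 2018'` succeeds, so a single-quoted form such as `git diff -G'^ \* @copyright'` is likely closer to what was intended.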

RomainValeri
  • Hmm, but this way I don't get any results. I have at least 300 files where this comment was changed, plus other changes in the code (which are what I want to get). – Attila Naghi Sep 19 '18 at 09:07