I'm trying to build a regex that matches any .c or .h file among the modified files listed by git status -s.

(Find the regex here: regexr.com/40afj )

(?:M).*.[c|h]

to be used on this kind of data:

M  fjdkls/fjdkslm/djks.c
M  fjdkls/fjdkslm/djks.c
M  fjdkls/fjdkslm/djks.h
M  fjdkls/fjdkslm/djks.h
??  fjdkls/fjdkslm/djks.c
??  fjdkls/fjdkslm/djks.c

Can you explain to me why the M is part of the match even though it is in a non-capturing group?

The expected result is to match only the full path of each modified file.

jota
  • What do you expect it to do? Do you know what a (non-)capturing group is? – Biffen Sep 28 '18 at 10:06
  • Looks like you need `^M(.*\.[ch])$`, see https://regexr.com/40ah3. Note that a *non-capturing* group still *matches* and consumes text, i.e. it adds the matched text to the overall match value. – Wiktor Stribiżew Sep 28 '18 at 10:06
  • (OT: If you want a list of modified files, you can use `git diff --name-only --diff-filter=M HEAD` instead of parsing `git status`.) – Biffen Sep 28 '18 at 10:11
  • Thanks Biffen, that solves the problem, but I'm still curious about capturing groups. I misunderstood them after reading https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-what-does-do. How can I retrieve only one group, with grep for example? – jota Sep 28 '18 at 10:23
  • Don't parse the output of `git status -s`, but rather the one of `git status --porcelain` which is guaranteed to remain stable in the future (but other than that essentially the same). – Matthieu Moy Sep 28 '18 at 10:37

2 Answers

A non-capturing group still takes part in the match; "non-capturing" only means that the text it matches is not stored as a separate group. Here the only group you need is the one that matches the file path:

[ ADMRCU?!]{2} (.+\.[ch])

With this regexp, the only captured group contains the file path. Also, use --porcelain instead of -s: its output format is meant to stay stable, so it is the option to use in scripts.
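As a rough illustration of the point above, here is a minimal Python sketch that applies this regexp to a few status lines. The sample text, the `STATUS_LINE` and `source_paths` names, and the use of Python's `re` module are assumptions made for the example, not something from the question. It shows that the whole match (`group(0)`) still includes the status characters, while `group(1)` holds only the path.

```python
import re

# Pattern from the answer above: two status characters, one space,
# then the path (ending in .c or .h) inside the only capturing group.
STATUS_LINE = re.compile(r"[ ADMRCU?!]{2} (.+\.[ch])")

# Hypothetical sample laid out like `git status --porcelain` output
# (two status characters, a space, then the path).
SAMPLE = """\
M  fjdkls/fjdkslm/djks.c
 M fjdkls/fjdkslm/djks.h
?? fjdkls/fjdkslm/new.c
M  README.md
"""


def source_paths(status_text):
    """Return the captured path of every line that fully matches STATUS_LINE."""
    paths = []
    for line in status_text.splitlines():
        match = STATUS_LINE.fullmatch(line)
        if match:
            paths.append(match.group(1))
    return paths


first = STATUS_LINE.fullmatch("M  fjdkls/fjdkslm/djks.c")
print(repr(first.group(0)))  # whole match, status characters included
print(repr(first.group(1)))  # captured path only
print(source_paths(SAMPLE))
```

Note that this pattern accepts every status code in its character class, including untracked (`??`) entries; to keep only modified files, the first character class would have to be narrowed.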

Marcin Pietraszek

It is only matching the ones beginning with M because the Regex is specifically looking for M.

If you add the optional quantifier (?) after the first group, the M becomes optional and the other lines match too.

(?:M)?.*\.[ch]

Non-capturing groups group tokens together without creating a capture group; they do not exclude their contents from the match. If you want to capture everything except the M, put the part you want in a capture group.

(?:.*?)[^\s]+\s+(.*\.[ch])

This matches the status prefix without capturing it, and captures the full path in group 1.
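The difference between matching and capturing can be checked with a short Python sketch. The two patterns below are simplified, anchored variations written for this illustration (they are not the exact regexes from the answers), and the sample lines copy the layout of the data in the question.

```python
import re

# Hypothetical sample lines in the layout shown in the question.
LINES = [
    "M  fjdkls/fjdkslm/djks.c",
    "M  fjdkls/fjdkslm/djks.h",
    "??  fjdkls/fjdkslm/djks.c",
]

# Non-capturing group only: no numbered group exists, but M is still consumed.
no_capture = re.compile(r"^(?:M)\s+.*\.[ch]$")
# Same prefix, with the path wrapped in a capturing group.
with_capture = re.compile(r"^(?:M)\s+(.*\.[ch])$")

for line in LINES:
    plain = no_capture.search(line)
    grouped = with_capture.search(line)
    if plain and grouped:
        print("whole match :", repr(plain.group(0)))
        print("group count :", no_capture.groups, "vs", with_capture.groups)
        print("captured    :", repr(grouped.group(1)))
```

In both patterns the whole match starts with M, because a non-capturing group consumes text like any other part of the regex. Only the second pattern exposes the bare path, through `group(1)`.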

TheLaj