# cat a.file
123123abc file2
274efcc4d8a0e8f61397be46bd4bfa37  file1
321  file1

# file="file1"
# grep -m 1 -oP "\K([\w]+) +$file" a.file
274efcc4d8a0e8f61397be46bd4bfa37  file1

How can I get the output 274efcc4d8a0e8f61397be46bd4bfa37 from the line that contains file1, using only grep, without | or another command such as awk?

Does grep -P have an option to print only the part matched by \K([\w]+), so that the result is 274efcc4d8a0e8f61397be46bd4bfa37? Or is there any other way to get this result with grep alone?

Shawn
VictorLee

2 Answers

$ file='file1'
$ grep -m1 -oP '\w+(?= +'"$file"'$)' a.file
274efcc4d8a0e8f61397be46bd4bfa37

(?=regexp) is a positive lookahead construct which helps you define an assertion without it being part of the matched portion. See Reference - What does this regex mean? for more information on lookarounds.

I've placed '\w+(?= +', "$file", and '$)' next to each other so that only the shell variable is inside double quotes. The $ anchor avoids partial matches, for example file1 matching file1a or file11. If your filename can contain regexp metacharacters (for example, .), use '\w+(?= +\Q'"$file"'\E$)' to match the filename literally.
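To illustrate why the \Q...\E quoting matters, here is a minimal sketch. The sample lines and the name file.1 are made up for the demonstration and are not from the question; it assumes a grep built with -P (PCRE) support.

```shell
# Hypothetical sample data: "fileX1" differs from "file.1" only at the dot.
tmp=$(mktemp)
printf '%s\n' 'aaa111  fileX1' 'bbb222  file.1' > "$tmp"

file='file.1'

# Without quoting, "." matches any character, so the fileX1 line matches first.
unquoted=$(grep -m1 -oP '\w+(?= +'"$file"'$)' "$tmp")

# With \Q...\E the dot is taken literally, so only the file.1 line matches.
quoted=$(grep -m1 -oP '\w+(?= +\Q'"$file"'\E$)' "$tmp")

echo "unquoted: $unquoted"   # unquoted: aaa111
echo "quoted:   $quoted"     # quoted:   bbb222

rm -f "$tmp"
```

The quoting would still break if the filename itself contained \E, so treat this as a sketch for ordinary filenames rather than a fully general escape.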


I'm not sure why you don't want to use awk here; it is well suited to this task. String comparison avoids having to deal with regexp escaping.

awk -v f="$file" '$2==f{print $1; exit}' a.file
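As a rough check of the string-comparison point, the sketch below runs the same awk program against made-up sample lines (the name file.1 and the hashes are hypothetical). Because $2 == f compares strings, the dot is not treated as a wildcard.

```shell
# Hypothetical sample data with a near-miss name ahead of the real one.
tmp=$(mktemp)
printf '%s\n' 'aaa111  fileX1' 'bbb222  file.1' 'ccc333  file.1' > "$tmp"

file='file.1'

# $2 == f is a plain string comparison; exit stops after the first hit,
# which plays the same role as grep -m1.
awk_result=$(awk -v f="$file" '$2 == f { print $1; exit }' "$tmp")

echo "$awk_result"   # bbb222

rm -f "$tmp"
```

One thing to keep in mind: awk -v processes backslash escape sequences in the assigned value, so a filename containing backslashes would need different handling.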
Sundeep

You are misusing \K: it discards everything matched up to that point, so it has no effect at the very start of the pattern. The " +file1" part is a consuming pattern, so it is returned as part of the match.
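A small sketch of what \K actually does, using the line from the question written to a temporary file. It assumes a grep with -P support; the variable names are only for illustration.

```shell
tmp=$(mktemp)
printf '%s\n' '274efcc4d8a0e8f61397be46bd4bfa37  file1' > "$tmp"

# \K at the very start has nothing to discard, so the whole match is printed.
k_at_start=$(grep -oP '\K\w+ +file1' "$tmp")

# \K after \w+ discards the hash and keeps only what follows it,
# which is the opposite of what the question wants.
k_after_hash=$(grep -oP '\w+\K +file1' "$tmp")

echo "[$k_at_start]"     # [274efcc4d8a0e8f61397be46bd4bfa37  file1]
echo "[$k_after_hash]"   # [  file1]

rm -f "$tmp"
```

Since \K can only drop text to the left of the part you keep, it cannot trim the filename on the right, which is why a lookahead is needed here.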

Use a non-consuming pattern:

grep -m 1 -oP "\w+(?=\s+$file\b)" a.file

See regex proof. \b will disallow matching file10.

EXPLANATION

--------------------------------------------------------------------------------
  \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    file1                    'file1'
--------------------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
--------------------------------------------------------------------------------
  )                        end of look-ahead
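The \b boundary rules out file10, but it still allows any non-word character after the name. The sketch below compares it with a $ anchor on made-up sample lines (file1.bak and the short hashes are hypothetical); which one is right depends on what can follow the filename in your data.

```shell
# Hypothetical sample data: two near-miss names before the exact one.
tmp=$(mktemp)
printf '%s\n' 'aaa111  file10' 'bbb222  file1.bak' 'ccc333  file1' > "$tmp"

file='file1'

# \b: "file10" is rejected, but "file1.bak" is accepted because "." is
# not a word character, so a boundary exists after "file1".
with_b=$(grep -m 1 -oP "\w+(?=\s+$file\b)" "$tmp")

# $: the name must end the line, so only the exact "file1" line matches.
with_end=$(grep -m 1 -oP "\w+(?=\s+$file\$)" "$tmp")

echo "with \\b: $with_b"     # with \b: bbb222
echo "with \$:  $with_end"   # with $:  ccc333

rm -f "$tmp"
```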
Ryszard Czech