I have an egrep command that does a good job of extracting all GET /admin/hb records from a Jetty access log:

egrep '^.*? ".+? /admin/hb .*?".*?$' /m1/logs/ap*access*2013_03_19.log

I would now like to get all the lines that aren't "GET /admin/hb". It's simple enough with egrep -v...

egrep -v '^.*? ".+? /admin/hb .*?".*?$' /m1/logs/ap*access*2013_03_19.log

...but I will ultimately be putting this expression into a Groovy script and would like to know how to negate the "/admin/hb" part. My weak attempt at a negative lookahead failed; it matches no lines at all.

egrep '^.*? ".+? ^(?!/admin/hb) .*?".*?$' /m1/logs/ap*access*2013_03_19.log

How can I get egrep to produce all the access log lines that don't match /admin/hb?

The test data set follows. I expect the solution to skip the first line, but match the next two:

127.0.0.1 -  -  [20/Mar/2013:16:37:08 +0000] "GET /admin/hb HTTP/1.1" 200 105  4
10.23.68.60 -  -  [20/Mar/2013:16:37:08 +0000] "GET /$PIT$/AUS/admin/hb HTTP/1.1" 200 0  4
10.23.68.64 -  -  [20/Mar/2013:16:36:47 +0000] "GET /handsets/dmhc HTTP/1.1" 200 0  1
Bob Kuhar
  • Hmmm. This answer is related but I can't figure out how to integrate it: http://stackoverflow.com/questions/406230/regular-expression-to-match-string-not-containing-a-word – Bob Kuhar Mar 20 '13 at 18:04

1 Answer

Does this work with your version of grep?

grep -P '^.*? "\S+?(?! /admin/hb) .*?".*?$' groovy
10.23.68.60 -  -  [20/Mar/2013:16:37:08 +0000] "GET /$PIT$/AUS/admin/hb HTTP/1.1" 200 0  4
10.23.68.64 -  -  [20/Mar/2013:16:36:47 +0000] "GET /handsets/dmhc HTTP/1.1" 200 0  1
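
To check the behaviour against the test data from the question, something like the following sketch may help. It assumes GNU grep built with PCRE support (needed for -P); the temporary file and the variable names are made up for illustration. It counts the lines kept by the lookahead pattern and, for comparison, the lines kept by inverting the original positive match with -v.

```shell
#!/bin/sh
# Hypothetical sample file holding the three test lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
127.0.0.1 -  -  [20/Mar/2013:16:37:08 +0000] "GET /admin/hb HTTP/1.1" 200 105  4
10.23.68.60 -  -  [20/Mar/2013:16:37:08 +0000] "GET /$PIT$/AUS/admin/hb HTTP/1.1" 200 0  4
10.23.68.64 -  -  [20/Mar/2013:16:36:47 +0000] "GET /handsets/dmhc HTTP/1.1" 200 0  1
EOF

# -P selects Perl-compatible regexes, which understand the (?!...) lookahead.
# -c prints the number of matching lines instead of the lines themselves.
matched=$(grep -c -P '^.*? "\S+?(?! /admin/hb) .*?".*?$' "$sample")

# For comparison: the positive match inverted with -v, written with plain
# greedy quantifiers because POSIX ERE has no lazy ones.
inverted=$(grep -c -E -v '^.* ".+ /admin/hb .*".*$' "$sample")

echo "lookahead: $matched, inverted: $inverted"
rm -f "$sample"
```

On the three sample lines both counts should come out as 2. Note that the lookahead as written has no trailing space after /admin/hb, so it would presumably also drop a request for a longer path that merely starts with /admin/hb.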
tink
  • That does work, thanks. But I'm left scratching my head why the same expression doesn't work through my egrep. Hmmm. – Bob Kuhar Mar 20 '13 at 22:21
  • Because ?! is a perlism, and using -P invokes the perlre personality of grep. – tink Mar 20 '13 at 22:23