0

I want to extract data from a text file using grep, awk, or a similar tool. For example, running something like "grep Proxy data.txt" should give me "15988": the number taken from the header of the paragraph that contains the pattern I am searching for (it is actually a port number). Each port's section is separated from the others by a line of "============" characters.

This is somewhat similar to "How can I make 'grep' show a single line five lines above the grepped line?", but I could not work out how to adapt it to my case.

http://prntscr.com/c1oq06

==============NEXT SERVICE FINGERPRINT (SUBMIT INDIVIDUALLY)==============
SF-Port15988-TCP:V=7.01%I=7%D=8/3%Time=57A107C3%P=i586-pc-linux-gnu%r(GetR
SF:equest,24E,"HTTP/1\.1\x20404\x20Not\x20found\r\nConnection:\x20close\r\
SF:nDate:\x20Tue,\x2002\x20Aug\x202016\x2020:44:05\x20GMT\r\nContent-Type:
SF:\x20text/html\r\nContent-Length:\x20407\r\nExpires:\x200\r\nCache-Contr
SF:ol:\x20no-cache\r\nPragma:\x20no-cache\r\n\r\n<!DOCTYPE\x20HTML\x20PUBL
SF:IC\x20\"-//W3C//DTD\x20HTML\x204\.01\x20Transitional//EN\"\x20\"http://
SF:www\.w3\.org/TR/html4/loose\.dtd\">\n<html><head>\n<title>Proxy\x20erro
SF:r:\x20404\x20Not\x20found\.</title>\n</head><body>\n<h1>404\x20Not\x20f
SF:ound</h1>\n<p>The\x20following\x20error\x20occurred\x20while\x20trying\
SF:x20to\x20access\x20<strong>/</strong>:<br><br>\n<strong>404\x20Not\x20f
SF:ound</strong></p>\n<hr>Generated\x20Wed,\x2003\x20Aug\x202016\x2003:44:
SF:05\x20WIT\x20by\x20rpc\x20on\x20<em>sevdev:15988</em>\.\n</body></html>
SF:\r\n")%r(HTTPOptions,258,"HTTP/1\.1\x20501\x20Method\x20not\x20implemen
SF:ted\r\nConnection:\x20close\r\nDate:\x20Tue,\x2002\x20Aug\x202016\x2020
SF::44:05\x20GMT\r\nContent-Type:\x20text/html\r\nContent-Length:\x20404\r
SF:\nExpires:\x200\r\nCache-Control:\x20no-cache\r\nPragma:\x20no-cache\r\
SF:n\r\n<!DOCTYPE\x20HTML\x20PUBLIC\x20\"-//W3C//DTD\x20HTML\x204\.01\x20T
SF:ransitional//EN\"\x20\"http://www\.w3\.org/TR/html4/loose\.dtd\">\n<htm
SF:l><head>\n<title>Proxy\x20error:\x20501\x20Method\x20not\x20implemented
SF:\.</title>\n</head><body>\n<h1>501\x20Method\x20not\x20implemented</h1>
SF:\n<p>The\x20following\x20error\x20occurred:<br><br>\n<strong>501\x20Met
SF:hod\x20not\x20implemented</strong></p>\n<hr>Generated\x20Wed,\x2003\x20
SF:Aug\x202016\x2003:44:05\x20WIT\x20by\x20rpc\x20on\x20<em>sevdev:15988</
SF:em>\.\n</body></html>\r\n")%r(RTSPRequest,26C,"HTTP/1\.1\x20400\x20Erro
SF:r\x20in\x20first\x20request\x20line\r\nConnection:\x20close\r\nDate:\x2
SF:0Tue,\x2002\x20Aug\x202016\x2020:44:05\x20GMT\r\nContent-Type:\x20text/
SF:html\r\nContent-Length:\x20419\r\nExpires:\x200\r\nCache-Control:\x20no
SF:-cache\r\nPragma:\x20no-cache\r\n\r\n<!DOCTYPE\x20HTML\x20PUBLIC\x20\"-
SF://W3C//DTD\x20HTML\x204\.01\x20Transitional//EN\"\x20\"http://www\.w3\.
SF:org/TR/html4/loose\.dtd\">\n<html><head>\n<title>Proxy\x20error:\x20400
SF:\x20Error\x20in\x20first\x20request\x20line\.</title>\n</head><body>\n<
SF:h1>400\x20Error\x20in\x20first\x20request\x20line</h1>\n<p>The\x20follo
SF:wing\x20error\x20occurred:<br><br>\n<strong>400\x20Error\x20in\x20first
SF:\x20request\x20line</strong></p>\n<hr>Generated\x20Wed,\x2003\x20Aug\x2
SF:02016\x2003:44:05\x20WIT\x20by\x20rpc\x20on\x20<em>sevdev:15988</em>\.\
SF:n</body></html>\r\n");

I have tried:

 grep "Proxy" aaa.txt

The result (I understand why I get this):

 SF:www\.w3\.org/TR/html4/loose\.dtd\">\n<html><head>\n<title>Proxy\x20erro
 SF:l><head>\n<title>Proxy\x20error:\x20501\x20Method\x20not\x20implemented
 SF:org/TR/html4/loose\.dtd\">\n<html><head>\n<title>Proxy\x20error:\x20400

I am just wondering how to cut/grep so that the result looks like this:

15988
15988
15988

Thanks,

Community
    Read [ask] then [edit] your question to provide the [mcve] including concise, testable sample input and expected output plus what you have attempted so far. Important - use text, not images, for examples so we have something to test a potential solution against. – Ed Morton Aug 04 '16 at 19:49

2 Answers

2

In general, to get the last line that satisfies ConditionA prior to a line that satisfies ConditionB, you could do:

awk 'ConditionA {last=$0} ConditionB {print last}'

If ConditionA and ConditionB could be true at the same time, you need to define more exactly how you want to deal with that case.
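As a small illustration of the general pattern above, here is a sketch on made-up input. The sample text and the two conditions (/^HEADER/ and /hit/) are assumptions chosen only for the demonstration; they are not taken from the question's data.

```shell
# Hypothetical sample: lines starting with "HEADER" play the role of
# ConditionA, lines containing "hit" play the role of ConditionB.
sample='HEADER one
data
data hit
HEADER two
data
HEADER three
hit again'

# Remember the most recent HEADER line; print it whenever "hit" is seen.
result=$(printf '%s\n' "$sample" |
  awk '/^HEADER/ {last=$0} /hit/ {print last}')

printf '%s\n' "$result"
```

With this input the sketch should print "HEADER one" and "HEADER three"; "HEADER two" is skipped because no "hit" line follows it before the next header.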

But, looking at the sample input, it seems that in your case:

  • ConditionA could be "does not start with SF:" which is !/^SF:/ in awk.

  • ConditionB could be "starts with SF: and contains the string Proxy" which is /^SF:.*Proxy/ in awk

Putting these together, you get:

awk '!/^SF:/ {last=$0} /^SF:.*Proxy/ {print last}'

Which you could rewrite as:

awk '!/^SF:/ {last=$0; next} /Proxy/ {print last}'
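
Note that, as written, these commands print the whole remembered line (the one starting with SF-Port15988-TCP:...) rather than just 15988. A minimal sketch of one way to trim it down to the number follows, assuming the header line always has the form SF-Port<digits>-...; the sample variable holds a shortened stand-in for the question's file, and the names sample, port and ports are only illustrative.

```shell
# Shortened stand-in for the fingerprint in the question (assumed sample).
sample='==============NEXT SERVICE FINGERPRINT (SUBMIT INDIVIDUALLY)==============
SF-Port15988-TCP:V=7.01%I=7%D=8/3%Time=57A107C3%P=i586-pc-linux-gnu%r(GetR
SF:www\.w3\.org/TR/html4/loose\.dtd\">\n<html><head>\n<title>Proxy\x20erro
SF:r:\x20404\x20Not\x20found\.</title>\n</head><body>\n<h1>404\x20Not\x20f
SF:l><head>\n<title>Proxy\x20error:\x20501\x20Method\x20not\x20implemented'

ports=$(printf '%s\n' "$sample" |
  awk '!/^SF:/ {last=$0; next}        # remember the latest non-"SF:" line
       /Proxy/ {
           port = last
           sub(/^SF-Port/, "", port)  # drop the "SF-Port" prefix
           sub(/[^0-9].*/, "", port)  # drop everything after the digits
           print port
       }')

printf '%s\n' "$ports"
```

On this shortened sample there are two lines containing Proxy, so the sketch should print 15988 twice.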
Matei David
1

Is this what you're trying to do?

$ awk 'match($0,/^SF-Port([0-9]+).*/,a){port=a[1]} /Proxy/{print port}' file
15988
15988
15988

The above uses GNU awk for the 3rd arg to match(). With other awks it'd be a couple of sub()s or similar to isolate the port number, e.g.:

$ awk 'sub(/^SF-Port/,""){sub(/[^0-9].*/,""); port=$0} /Proxy/{print port}' file
15988
15988
15988

or:

$ awk 'match($0,/^SF-Port[0-9]+/){port=substr($0,8,RLENGTH-7)} /Proxy/{print port}' file
15988
15988
15988
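
If the file holds several fingerprint blocks and each port should be listed only once, the last variant could be extended along these lines. This is only a sketch under stated assumptions: the three-block sample (including ports 8080 and 3128) is invented for illustration, and the seen[] array is an added deduplication step, not something the question asked for.

```shell
# Hypothetical multi-block input; only the first and third blocks mention Proxy.
sample='==============NEXT SERVICE FINGERPRINT (SUBMIT INDIVIDUALLY)==============
SF-Port15988-TCP:V=7.01%I=7%r(GetR
SF:<title>Proxy\x20error
==============NEXT SERVICE FINGERPRINT (SUBMIT INDIVIDUALLY)==============
SF-Port8080-TCP:V=7.01%I=7%r(GetR
SF:<title>Welcome
==============NEXT SERVICE FINGERPRINT (SUBMIT INDIVIDUALLY)==============
SF-Port3128-TCP:V=7.01%I=7%r(GetR
SF:<title>Proxy\x20error
SF:<h1>Proxy\x20error</h1>'

# "SF-Port" is 7 characters, so the digits start at position 8.
ports=$(printf '%s\n' "$sample" |
  awk 'match($0, /^SF-Port[0-9]+/) {port = substr($0, 8, RLENGTH - 7)}
       /Proxy/ && !seen[port]++ {print port}')

printf '%s\n' "$ports"
```

Here the port variable is updated at every SF-Port header, so a Proxy match is always attributed to the block it appears in; the sketch should print 15988 and 3128, each once. Dropping the !seen[port]++ test gives back one line of output per matching line, as in the commands above.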
Ed Morton