Questions tagged [awk]

AWK (named after its authors Aho, Weinberger, and Kernighan) is an interpreted programming language designed for text processing and typically used as a data extraction and reporting tool. It is a standard feature of most Unix-like operating systems.

Source: Wikipedia.

An awk program is a series of pattern-action pairs, written as:

condition { action }
condition { action }
...

where condition is typically an expression and action is a series of one or more commands separated by semicolons (;). The input is split into records, separated by newlines by default, and processed one record at a time (by default, line by line). For each record, every condition is checked and, if it is true, its action block is executed. If the condition is missing, the action block is executed for every record. If the condition is present but the action block is absent, the default action is print $0, which prints the current record, including any modifications made to it. Since a non-zero number is equivalent to true, awk '1' file instructs awk to perform the default action (print) for every line.
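The rules above can be sketched with a few one-liners. This is only an illustration: the three-line sample input, the threshold 10, and the /an/ pattern are made up for the example and do not come from any particular question.

```shell
# Hypothetical sample input: a name and a count on each line.
input='apple 5
banana 12
cherry 30'

# Condition and action: print the first field of records whose second field exceeds 10.
printf '%s\n' "$input" | awk '$2 > 10 { print $1 }'

# Condition only: the default action (print $0) runs for records matching /an/.
printf '%s\n' "$input" | awk '/an/'

# Action only: the block runs for every record; NR is the current record number.
printf '%s\n' "$input" | awk '{ print NR ": " $0 }'

# Constant true condition: every record is printed unchanged.
printf '%s\n' "$input" | awk '1'
```

With this input, the first command should print banana and cherry, and the second should print only the banana line.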

An awk program can have optional BEGIN and END blocks, where the BEGIN action runs before any input is read and the END action runs after all input has been read:

BEGIN     { action } 
condition { action }
condition { action }
...
END       { action }
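As a minimal sketch of that layout, the following sums the first field of every record. The function name sum_column and the sample numbers are hypothetical, chosen only for illustration.

```shell
# sum_column (hypothetical helper): reads records on stdin and sums field 1.
sum_column() {
  awk '
    BEGIN { print "summing..." }     # runs once, before any input is read
          { total += $1 }            # no condition: runs for every record
    END   { print "total:", total }  # runs once, after all input has been read
  '
}

# Usage: feed three numbers, one per line.
printf '%s\n' 3 4 5 | sum_column
```

Here the BEGIN message should appear first and the total (12 for this input) last, regardless of how many records are read in between.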

Awk was originally developed by Alfred Aho, Brian Kernighan and Peter Weinberger in 1977 and updated in 1985. Since then, various versions and dialects of awk have emerged. The most common are:

  • awk - the most common name, found on most Unix-like systems. Its behavior is specified by an IEEE (POSIX) standard.
  • mawk - a fast AWK implementation whose code base is built around a byte-code interpreter.
  • nawk - during the development of AWK, the developers released a new version ("new awk") under a separate name to avoid confusion, but it is itself now very old and lacks functionality present in all POSIX awks.
  • gawk - also known as GNU awk. The only version whose developers have attempted to add i18n support. It also lets users write their own C shared libraries to extend it with "plug-ins". This version is the default awk on many Linux distributions.

When asking questions about data processing using awk, please include complete input and desired output.

Related tags:

  • gawk (GNU's version of awk)
  • nawk (A very old, pre-POSIX version also from AT&T)
  • mawk (A different interpreter written by Mike Brennan)
  • sed (A kindred tool often mentioned in the same breath)
29337 questions
759 votes, 35 answers

How to do a recursive find/replace of a string with awk or sed?

How do I find and replace every occurrence of: subdomainA.example.com with subdomainB.example.com in every text file under the /home/www/ directory tree recursively?
Tedd

704 votes, 26 answers

Find and kill a process in one line using bash and regex

I often need to kill a process during programming. The way I do it now is: [~]$ ps aux | grep 'python csp_build.py' user 5124 1.0 0.3 214588 13852 pts/4 Sl+ 11:19 0:00 python csp_build.py user 5373 0.0 0.0 8096 960 pts/6 S+ …
Orjanp

696 votes, 19 answers

Bash tool to get nth line from a file

Is there a "canonical" way of doing that? I've been using head -n | tail -1 which does the trick, but I've been wondering if there's a Bash tool that specifically extracts a line (or a range of lines) from a file. By "canonical" I mean a program…
Vlad Vivdovitch

520 votes, 3 answers

What is the difference between sed and awk?

What is the difference between awk and sed? What kinds of applications are the best use cases for the sed and awk tools?
Rachel

369 votes, 24 answers

Using awk to print all columns from the nth to the last

This line worked until I had whitespace in the second field. svn status | grep '\!' | gawk '{print $2;}' > removedProjs is there a way to have awk print everything in $2 or greater? ($3, $4.. until we don't have anymore columns?) I suppose I…
Andy

338 votes, 8 answers

How do I use shell variables in an awk script?

I found some ways to pass external shell variables to an awk script, but I'm confused about ' and ". First, I tried with a shell script: $ v=123test $ echo $v 123test $ echo "$v" 123test Then tried awk: $ awk 'BEGIN{print "'$v'"}' $ 123test $ awk…
hqjma

299 votes, 19 answers

How can I shuffle the lines of a text file on the Unix command line or in a shell script?

I want to shuffle the lines of a text file randomly and create a new file. The file may have several thousands of lines. How can I do that with cat, awk, cut, etc?
Ruggiero Spearman

287 votes, 1 answer

How to remove double-quotes in jq output for parsing json files in bash?

I'm using jq to parse a JSON file as shown here. However, the results for string values contain the "double-quotes" as expected, as shown below: $ cat json.txt | jq '.name' "Google" How can I pipe this into another command to remove the ""? so I…
Chris F

283 votes, 8 answers

How can I use ":" as an AWK field separator?

Given the following command, echo "1: " | awk '/1/ -F ":" {print $1}' why does AWK output: 1: ?
user173446

263 votes, 5 answers

What are the differences between Perl, Python, AWK and sed?

I just want to know what the main differences among them are, and the strengths of each language (where it is better to use it). Edit: this is not a "vs." topic, just a request for information.
Khaled Al Hourani

255 votes, 6 answers

AWK: Access captured group from line pattern

If I have an awk command pattern { ... } and pattern uses a capturing group, how can I access the string so captured in the block?
rampion

241 votes, 12 answers

How to show only next line after the matched one?

grep -A1 'blah' logfile Thanks to this command for every line that has 'blah' in it, I get the output of the line that contains 'blah' and the next line that follows in the logfile. It might be a simple one but I can't find a way to omit the line…
facha

222 votes, 7 answers

Using multiple delimiters in awk

I have a file which contain following lines: /logs/tc0001/tomcat/tomcat7.1/conf/catalina.properties:app.env.server.name = demo.example.com /logs/tc0001/tomcat/tomcat7.2/conf/catalina.properties:app.env.server.name =…
Satish

211 votes, 33 answers

How can I quickly sum all numbers in a file?

I have a file which contains several thousand numbers, each on its own line: 34 42 11 6 2 99 ... I'm looking to write a script which will print the sum of all numbers in the file. I've got a solution, but it's not very efficient. (It takes several…
Mark Roberts

202 votes, 10 answers

How to split a delimited string into an array in awk?

How to split the string when it contains pipe symbols | in it. I want to split them to be in array. I tried echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}' Which works fine. If my string is like "12|23|11" then how do I split them…
Mohamed Saligh