awk is both a programming language and a text processor that you can use to manipulate text data. In this guide, you'll explore the awk command-line tool and how to use it to process text.
Linux utilities often follow the Unix philosophy of design: tools are encouraged to be small, to use plain text files for input and output, and to operate in a modular manner. Because of this legacy, we have great text-processing functionality with tools like sed and awk.
You can use the BEGIN and END blocks to print information about the fields you are printing, for example to turn the data from a file into a table whose columns are separated by tabs (\t).
The basic format of an awk command is:
awk '/search_pattern/ { action_to_take_on_matches; another_action; }' file_to_parse
Create a favorite_food.txt file which lists the favorite foods of a group of friends, one per line:
echo "carrot sandy
wasabi luke
sandwich brian
salad ryan
spaghetti jessica" > favorite_food.txt
Now use the awk command to print the file to the screen:
awk '{print}' favorite_food.txt
This isn't very useful. Let's try out awk's search filtering capabilities by searching through the file for the text "sand":
awk '/sand/' favorite_food.txt
Output
carrot sandy
sandwich brian
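BEGIN and END blocks, mentioned in the introduction, run before the first line is read and after the last one. The following is a minimal sketch of the tab-spaced table idea, not the original command: the FOOD and NAME headings and the row-count footer are illustrative assumptions.

```shell
# Recreate the sample file so the example is self-contained.
printf '%s\n' 'carrot sandy' 'wasabi luke' 'sandwich brian' \
  'salad ryan' 'spaghetti jessica' > favorite_food.txt

# BEGIN prints a header before any input, the middle block runs once per
# line, and END prints a footer after the last line. NR is the line count.
table=$(awk 'BEGIN { print "FOOD\tNAME" }
             { print $1 "\t" $2 }
             END { print NR " rows" }' favorite_food.txt)
printf '%s\n' "$table"
```

The header and footer each appear exactly once because BEGIN and END are not tied to any input line.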
Awk's basic syntax is:
awk [options] 'pattern {action}' file
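To make the [options] slot concrete, here is a small sketch using two standard options: -F, which sets the field separator, and -v, which assigns a variable before the program runs. The input lines and the prefix variable are made up for illustration.

```shell
# -F: splits each line on colons instead of whitespace.
# -v prefix=... defines an awk variable that the action can use.
users=$(printf 'root:x:0\ndaemon:x:1\n' |
  awk -F: -v prefix='user=' '{ print prefix $1 }')
printf '%s\n' "$users"
```

With no pattern before the braces, the action runs on every input line.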
To get started, create this sample file and save it as colours.txt:
name color amount
apple red 4
banana yellow 6
strawberry red 3
grape purple 10
apple green 8
plum purple 2
kiwi brown 4
potato brown 9
pineapple yellow 5
In awk, the print statement displays whatever you specify. There are many predefined variables you can use, but some of the most common are the field variables: a $ followed by an integer designates a column in a text file. Try it out:
$ awk '{print $2;}' colours.txt
color
red
yellow
red
purple
green
purple
brown
brown
yellow
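Two other predefined variables are worth a quick look: NR holds the current line number and NF the number of fields on that line. A short sketch, using the first three lines of the sample data inline rather than the full file:

```shell
# NR counts records (lines) read so far; NF counts fields in the current one.
numbered=$(printf 'name color amount\napple red 4\nbanana yellow 6\n' |
  awk '{ print NR ": " $1 " (" NF " fields)" }')
printf '%s\n' "$numbered"
```

Placing strings and variables side by side in print concatenates them without a separator.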
Regular expressions work as well. This conditional matches rows where $2 contains the letter p, followed by one or more characters of any kind, followed by another p:
$ awk '$2 ~ /p.+p/ {print $0}' colours.txt
grape purple 10
plum purple 2
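As a further sketch of field matching, the ^ anchor restricts a match to the start of a field, and the !~ operator selects rows that do not match. The four rows below are a subset of the sample data, inlined so the example runs on its own.

```shell
colours=$(printf 'apple red 4\ngrape purple 10\nplum purple 2\npineapple yellow 5\n')

# ^ anchors the pattern to the start of the field: names beginning with p.
starts_with_p=$(printf '%s\n' "$colours" | awk '$1 ~ /^p/ { print $1 }')

# !~ negates the match: rows whose colour contains no letter p.
no_p=$(printf '%s\n' "$colours" | awk '$2 !~ /p/ { print $1 }')

printf '%s\n' "$starts_with_p" "$no_p"
```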
Numbers are interpreted naturally by awk. For instance, to print any row with a third column containing an integer greater than 5:
$ awk '$3>5 {print $1, $2}' colours.txt
name color
banana yellow
grape purple
apple green
potato brown
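Note that the header row appears in this output. The word "amount" is not a number, so awk compares it with 5 as a string, and letters normally sort after digits. One way to skip the header, sketched here with the NR variable (the current line number), is to add a second condition:

```shell
# Recreate the sample file so the example is self-contained.
cat > colours.txt <<'EOF'
name color amount
apple red 4
banana yellow 6
strawberry red 3
grape purple 10
apple green 8
plum purple 2
kiwi brown 4
potato brown 9
pineapple yellow 5
EOF

# NR > 1 skips the first line, so only real amounts are compared with 5.
big=$(awk 'NR > 1 && $3 > 5 { print $1, $2 }' colours.txt)
printf '%s\n' "$big"
```

The comma in print inserts the output field separator, a single space by default.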
In the pipelines below, the first sed removes the curly braces. The second sed, where present, removes the double quotes. The awk command splits the input into records at each comma and splits each record into fields at the colon; if the first field, $1, is equal to jobState, it prints the second field, $2.
Find jobState, print the second field, and remove the double quotes.
These methods are used within the sample scripts because they rely on native Linux tools, which typically do not require you to load extra packages or libraries onto the system.
If the results contain an array of values, you need to loop through each element and parse out the desired value, as the second script below does.
json='{"type":"OKResult","status":"OK","result":{"type":"Job","reference":"JOB-53","namespace":null,"name":null,"actionType":"DB_SYNC","target":"ORACLE_DB_CONTAINER-9","targetObjectType":"OracleDatabaseContainer","jobState":"RUNNING","startTime":"2016-08-12T19:58:59.811Z","updateTime":"2016-08-12T19:58:59.828Z","suspendable":true,"cancelable":true,"queued":false,"user":"USER-2","emailAddresses":null,"title":"Run SnapSync for database \"VDPXDEV1\".","percentComplete":0.0,"targetName":"Oracle_Source/VDPXDEV1","events":[{"type":"JobEvent","timestamp":"2016-08-12T19:58:59.840Z","state":null,"percentComplete":0.0,"messageCode":"event.job.started","messageDetails":"DB_SYNC job started for \"Oracle_Source/VDPXDEV1\".","messageAction":null,"messageCommandOutput":null,"diagnoses":[],"eventType":"INFO"}],"parentActionState":"WAITING","parentAction":"ACTION-238"},"job":null,"action":null}'
echo "$json" | sed -e 's/[{}]/''/g' | awk -v RS=',' -F: '{print $1 $2}'
"type""OKResult"
"status""OK"
"result""type"
"reference""JOB-53"
"namespace"null
"name"null
"actionType""DB_SYNC"
"target""ORACLE_DB_CONTAINER-9"
"targetObjectType""OracleDatabaseContainer"
"jobState""RUNNING"
"startTime""2016-08-12T19
"updateTime""2016-08-12T19
"suspendable"true
"cancelable"true
"queued"false
"user""USER-2"
"emailAddresses"null
"title""Run SnapSync for database \"VDPXDEV1\"."
"percentComplete"0.0
"targetName""Oracle_Source/VDPXDEV1"
"events"["type"
"timestamp""2016-08-12T19
"state"null
"percentComplete"0.0
"messageCode""event.job.started"
"messageDetails""DB_SYNC job started for \"Oracle_Source/VDPXDEV1\"."
"messageAction"null
"messageCommandOutput"null
"diagnoses"[]
"eventType""INFO"]
"parentActionState""WAITING"
"parentAction""ACTION-238"
"job"null
"action"null
echo "$json" | sed -e 's/[{}]/''/g' | sed s/\"//g | awk -v RS=',' -F: '$1=="jobState"{print $2}'
RUNNING
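The sed and awk pipeline for pulling out a single key can be wrapped in a small shell function. This is only a sketch: json_value is a hypothetical helper name, the sample JSON is a shortened stand-in, and the approach still breaks on values that contain commas or colons (as the truncated timestamps show).

```shell
# Hypothetical helper: print the value stored under a key in flat JSON.
# $1 = JSON text, $2 = key name.
json_value() {
  printf '%s\n' "$1" |
    sed -e 's/[{}]//g' -e 's/"//g' |
    awk -v RS=',' -F: -v key="$2" '$1 == key { print $2 }'
}

sample='{"status":"OK","jobState":"RUNNING","percentComplete":0.0}'
state=$(json_value "$sample" jobState)
printf '%s\n' "$state"
```

Passing the key with -v keeps it out of the awk program text, so no quoting tricks are needed to change the key being looked up.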
json='{"type":"ListResult","status":"OK","result":[{"type":"WindowsHostEnvironment","reference":"WINDOWS_HOST_ENVIRONMENT-1","namespace":null,"name":"Window Target","description":"","primaryUser":"HOST_USER-1","enabled":false,"host":"WINDOWS_HOST-1","proxy":null},{"type":"UnixHostEnvironment","reference":"UNIX_HOST_ENVIRONMENT-3","namespace":null,"name":"Oracle Target","description":"","primaryUser":"HOST_USER-3","enabled":true,"host":"UNIX_HOST-3","aseHostEnvironmentParameters":null}],"job":null,"action":null,"total":2,"overflow":false}'
SOURCE_ENV="Oracle Target"
# Keep only the array contents, then split them into one object per line.
# RS is a single closing brace here because a multi-character RS is not portable across awk implementations.
lines=`echo "$json" | cut -d "[" -f2 | cut -d "]" -f1 | awk -v RS='}' '{print $0}'`
while read -r line
do
   #echo "Processing $line"
   # Strip braces and quotes, then pick out the value of the name key.
   TMPNAME=`echo "$line" | sed -e 's/[{}]/''/g' | sed s/\"//g | awk -v RS=',' -F: '$1=="name"{print $2}'`
   #echo "Name: |${TMPNAME}| |${SOURCE_ENV}|"
   if [[ "${TMPNAME}" == "${SOURCE_ENV}" ]]
   then
      # Same pipeline, this time for the primaryUser key of the matching object.
      PRI_USER=`echo "$line" | sed -e 's/[{}]/''/g' | sed s/\"//g | awk -v RS=',' -F: '$1=="primaryUser"{print $2}'`
      break
   fi
done <<< "$lines"
echo "primaryUser reference: ${PRI_USER}"
primaryUser reference: HOST_USER-3
$ which perl
/usr/bin/perl
$ which python
/usr/bin/python