{"id":11622,"date":"2019-03-15T02:06:32","date_gmt":"2019-03-15T02:06:32","guid":{"rendered":"http:\/\/www.appservgrid.com\/paw92\/?p=11622"},"modified":"2019-03-15T02:06:32","modified_gmt":"2019-03-15T02:06:32","slug":"how-to-use-awk-and-regular-expressions-to-filter-text-or-string-in-files","status":"publish","type":"post","link":"https:\/\/www.appservgrid.com\/paw92\/index.php\/2019\/03\/15\/how-to-use-awk-and-regular-expressions-to-filter-text-or-string-in-files\/","title":{"rendered":"How to Use Awk and Regular Expressions to Filter Text or String in Files"},"content":{"rendered":"<p>When we run certain commands in Unix\/Linux to read or edit text from a string or file, we often try to filter the output down to a given section of interest. This is where regular expressions come in handy.<\/p>\n<p><b>Read Also:<\/b>\u00a0<a href=\"https:\/\/www.tecmint.com\/chaining-operators-in-linux-with-practical-examples\/\" target=\"_blank\" rel=\"noopener\">10 Useful Linux Chaining Operators with Practical Examples<\/a><\/p>\n<h4>What are Regular Expressions?<\/h4>\n<p>A regular expression can be defined as a string that represents several sequences of characters. 
One of the most important things about regular expressions is that they allow you to filter the output of a command or file, edit a section of a text or configuration file and so on.<\/p>\n<h4>Features of Regular Expression<\/h4>\n<p>Regular expressions are made of:<\/p>\n<ol>\n<li><strong>Ordinary characters<\/strong>\u00a0such as space, underscore (_), A-Z, a-z, 0-9.<\/li>\n<li><strong>Meta characters<\/strong>\u00a0that are expanded to ordinary characters. They include:\n<ol>\n<li><code>(.)<\/code>\u00a0it matches any single character except a newline.<\/li>\n<li><code>(*)<\/code>\u00a0it matches zero or more occurrences of the character immediately preceding it.<\/li>\n<li><code>[ character(s) ]<\/code>\u00a0it matches any one of the characters specified in character(s). You can also use a hyphen\u00a0<code>(-)<\/code>\u00a0to mean a range of characters such as\u00a0<code>[a-f]<\/code>,\u00a0<code>[1-5]<\/code>, and so on.<\/li>\n<li><code>^<\/code>\u00a0it matches the beginning of a line in a file.<\/li>\n<li><code>$<\/code>\u00a0it matches the end of a line in a file.<\/li>\n<li><code>\\<\/code>\u00a0it is an escape character.<\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<p>In order to filter text, one has to use a text filtering tool such as\u00a0<strong>awk<\/strong>. You can think of\u00a0<strong>awk<\/strong>\u00a0as a programming language of its own. But for the scope of this guide to using\u00a0<strong>awk<\/strong>, we shall cover it as a simple command line filtering tool.<\/p>\n<p><center>The general syntax of awk is:<\/center><\/p>\n<pre># awk 'script' filename\r\n<\/pre>\n<p>Where\u00a0<code>'script'<\/code>\u00a0is a set of commands that are understood by\u00a0<strong>awk<\/strong>\u00a0and are executed on the file, filename.<\/p>\n<p>It works by reading a given line in the file, making a copy of the line, and then executing the script on the line. 
This is repeated on all the lines in the file.<\/p>\n<p>The\u00a0<code>'script'<\/code>\u00a0is in the form\u00a0<code>'\/pattern\/ action'<\/code>\u00a0where\u00a0<strong>pattern<\/strong>\u00a0is a regular expression and the\u00a0<strong>action<\/strong>\u00a0is what awk will do when it finds the given pattern in a line.<\/p>\n<h3>How to Use Awk Filtering Tool in Linux<\/h3>\n<p>In the following examples, we shall focus on the meta characters that we discussed above under the features of regular expressions.<\/p>\n<h4>A simple example of using awk:<\/h4>\n<p>The example below prints all the lines in the file\u00a0<strong>\/etc\/hosts<\/strong>\u00a0since no pattern is given.<\/p>\n<pre># awk '<strong>\/\/<\/strong>{print}' \/etc\/hosts\r\n<\/pre>\n<div id=\"attachment_19810\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Awk-Command-Example.gif\" rel=\"attachment wp-att-19810\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19810\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Awk-Command-Example.gif\" alt=\"Awk Prints all Lines in a File \" width=\"670\" height=\"314\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Awk Prints all Lines in a File<\/p>\n<\/div>\n<h4>Use Awk with Pattern:<\/h4>\n<p>In the example below, a pattern\u00a0<code>localhost<\/code>\u00a0has been given, so awk will match the lines containing\u00a0<strong>localhost<\/strong>\u00a0in the\u00a0<code>\/etc\/hosts<\/code>\u00a0file.<\/p>\n<pre># awk '<strong>\/localhost\/<\/strong>{print}' \/etc\/hosts \r\n<\/pre>\n<div id=\"attachment_19811\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-Command-with-Pattern.gif\" rel=\"attachment wp-att-19811\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19811\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-Command-with-Pattern.gif\" alt=\"Awk Print 
Given Matching Line in a File\" width=\"673\" height=\"129\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Awk Print Given Matching Line in a File<\/p>\n<\/div>\n<h4>Using Awk with (.) wild card in a Pattern<\/h4>\n<p>The\u00a0<code>(.)<\/code>\u00a0will match strings containing\u00a0<strong>loc<\/strong>,\u00a0<strong>localhost<\/strong>,\u00a0<strong>localnet<\/strong>\u00a0in the example below.<\/p>\n<p>That is to say\u00a0<strong>* l some_single_character c *<\/strong>.<\/p>\n<pre># awk '<strong>\/l.c\/<\/strong>{print}' \/etc\/hosts\r\n<\/pre>\n<div id=\"attachment_19812\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-with-Wild-Cards.gif\" rel=\"attachment wp-att-19812\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19812\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-with-Wild-Cards.gif\" alt=\"Use Awk to Print Matching Strings in a File\" width=\"675\" height=\"155\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk to Print Matching Strings in a File<\/p>\n<\/div>\n<h4>Using Awk with (*) Character in a Pattern<\/h4>\n<p>Since\u00a0<code>l*<\/code>\u00a0can match zero occurrences of\u00a0<strong>l<\/strong>, the pattern matches any line that contains\u00a0<strong>c<\/strong>. It will match strings containing\u00a0<strong>localhost<\/strong>,\u00a0<strong>localnet<\/strong>,\u00a0<strong>lines<\/strong>,\u00a0<strong>capable<\/strong>, as in the example below:<\/p>\n<pre># awk '<strong>\/l*c\/<\/strong>{print}' \/etc\/hosts\r\n<\/pre>\n<div id=\"attachment_19813\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Match-Strings-in-File.gif\" rel=\"attachment wp-att-19813\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19813\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Match-Strings-in-File.gif\" alt=\"Use Awk to Match Strings in File\" width=\"725\" height=\"250\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p 
class=\"wp-caption-text\">Use Awk to Match Strings in File<\/p>\n<\/div>\n<p>You will also notice that\u00a0<code>(*)<\/code>\u00a0tries to get you the longest match possible.<\/p>\n<p>Let us look at a case that demonstrates this. Take the regular expression\u00a0<code>t.*t<\/code>, which means match strings that start with the letter\u00a0<code>t<\/code>\u00a0and end with\u00a0<code>t<\/code>, with any characters in between, in the line below:<\/p>\n<pre>this is tecmint, where you get the best good tutorials, how to's, guides, tecmint. \r\n<\/pre>\n<p>You will get the following possibilities when you use the pattern\u00a0<code>\/t.*t\/<\/code>:<\/p>\n<pre>this is t\r\nthis is tecmint\r\nthis is tecmint, where you get t\r\nthis is tecmint, where you get the best good t\r\nthis is tecmint, where you get the best good tutorials, how t\r\nthis is tecmint, where you get the best good tutorials, how to's, guides, t\r\nthis is tecmint, where you get the best good tutorials, how to's, guides, tecmint\r\n<\/pre>\n<p>Because the\u00a0<code>(*)<\/code>\u00a0in\u00a0<code>\/t.*t\/<\/code>\u00a0is greedy, awk chooses the last, longest option:<\/p>\n<pre>this is tecmint, where you get the best good tutorials, how to's, guides, tecmint\r\n<\/pre>\n<h4>Using Awk with set [ character(s) ]<\/h4>\n<p>Take for example the set\u00a0<code>[al1]<\/code>. Here awk will match all strings containing the character\u00a0<code>a<\/code>\u00a0or\u00a0<code>l<\/code>\u00a0or\u00a0<code>1<\/code>\u00a0in a line in the file\u00a0<strong>\/etc\/hosts<\/strong>.<\/p>\n<pre># awk '<strong>\/[al1]\/<\/strong>{print}' \/etc\/hosts\r\n<\/pre>\n<div id=\"attachment_19814\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-Matching-Character.gif\" rel=\"attachment wp-att-19814\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19814\" 
src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-Matching-Character.gif\" alt=\"Use Awk to Print Matching Character in File\" width=\"674\" height=\"288\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk to Print Matching Character in File<\/p>\n<\/div>\n<p>The next example matches strings starting with either\u00a0<code>K<\/code>\u00a0or\u00a0<code>k<\/code>\u00a0followed by\u00a0<code>T<\/code>:<\/p>\n<pre># awk '<strong>\/[Kk]T\/<\/strong>{print}' \/etc\/hosts \r\n<\/pre>\n<div id=\"attachment_19815\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-Matched-String-in-File.gif\" rel=\"attachment wp-att-19815\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19815\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-Matched-String-in-File.gif\" alt=\"Use Awk to Print Matched String in File\" width=\"592\" height=\"73\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk to Print Matched String in File<\/p>\n<\/div>\n<h4>Specifying Characters in a Range<\/h4>\n<p>Understanding character ranges in awk:<\/p>\n<ol>\n<li><code>[0-9]<\/code>\u00a0means match a single digit<\/li>\n<li><code>[a-z]<\/code>\u00a0means match a single lower case letter<\/li>\n<li><code>[A-Z]<\/code>\u00a0means match a single upper case letter<\/li>\n<li><code>[a-zA-Z]<\/code>\u00a0means match a single letter<\/li>\n<li><code>[a-zA-Z0-9]<\/code>\u00a0means match a single letter or digit<\/li>\n<\/ol>\n<p>Let's look at an example below:<\/p>\n<pre># awk '<strong>\/[0-9]\/<\/strong>{print}' \/etc\/hosts \r\n<\/pre>\n<div id=\"attachment_19816\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-To-Print-Matching-Numbers-in-File.gif\" rel=\"attachment wp-att-19816\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full 
wp-image-19816\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-To-Print-Matching-Numbers-in-File.gif\" alt=\"Use Awk To Print Matching Numbers in File\" width=\"676\" height=\"310\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk To Print Matching Numbers in File<\/p>\n<\/div>\n<p>All the lines printed from the file\u00a0<strong>\/etc\/hosts<\/strong>\u00a0contain at least a single digit\u00a0<code>[0-9]<\/code>\u00a0in the above example.<\/p>\n<h4>Use Awk with (^) Meta Character<\/h4>\n<p>It matches all the lines that start with the pattern provided, as in the example below:<\/p>\n<pre># awk '<strong>\/^fe\/<\/strong>{print}' \/etc\/hosts\r\n# awk '<strong>\/^ff\/<\/strong>{print}' \/etc\/hosts\r\n<\/pre>\n<div id=\"attachment_19817\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-All-Matching-Lines-with-Pattern.gif\" rel=\"attachment wp-att-19817\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19817\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-All-Matching-Lines-with-Pattern.gif\" alt=\"Use Awk to Print All Matching Lines with Pattern\" width=\"573\" height=\"174\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk to Print All Matching Lines with Pattern<\/p>\n<\/div>\n<h4>Use Awk with ($) Meta Character<\/h4>\n<p>It matches all the lines that end with the pattern provided:<\/p>\n<pre># awk '<strong>\/ab$\/<\/strong>{print}' \/etc\/hosts\r\n# awk '<strong>\/ost$\/<\/strong>{print}' \/etc\/hosts\r\n# awk '<strong>\/rs$\/<\/strong>{print}' \/etc\/hosts\r\n<\/pre>\n<div id=\"attachment_19818\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-Given-Pattern-String.gif\" rel=\"attachment wp-att-19818\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19818\" 
src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-to-Print-Given-Pattern-String.gif\" alt=\"Use Awk to Print Given Pattern String\" width=\"586\" height=\"185\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk to Print Given Pattern String<\/p>\n<\/div>\n<h4>Use Awk with (\\) Escape Character<\/h4>\n<p>It allows you to take the character following it as a literal, that is to say, consider it just as it is.<\/p>\n<p>In the example below, the first command prints out all the lines in the file. The second command prints out nothing: I want to match a line that has\u00a0<strong>$25.00<\/strong>, but no escape character is used, so\u00a0<code>$<\/code>\u00a0is read as the end-of-line meta character.<\/p>\n<p>The third command is correct since an escape character has been used to read\u00a0<strong>$<\/strong>\u00a0as it is.<\/p>\n<pre># awk '<strong>\/\/<\/strong>{print}' deals.txt\r\n# awk '<strong>\/$25.00\/<\/strong>{print}' deals.txt\r\n# awk '<strong>\/\\$25.00\/<\/strong>{print}' deals.txt\r\n<\/pre>\n<div id=\"attachment_19819\" class=\"wp-caption aligncenter\">\n<p><a href=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-with-Escape-Character.gif\" rel=\"attachment wp-att-19819\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-19819\" src=\"https:\/\/www.tecmint.com\/wp-content\/uploads\/2016\/04\/Use-Awk-with-Escape-Character.gif\" alt=\"Use Awk with Escape Character\" width=\"619\" height=\"211\" data-lazy-loaded=\"true\" \/><\/a><\/p>\n<p class=\"wp-caption-text\">Use Awk with Escape Character<\/p>\n<\/div>\n<h3>Summary<\/h3>\n<p>That is not all there is to the\u00a0<strong>awk<\/strong>\u00a0command line filtering tool; the examples above are the basic operations of awk. In the next parts, we shall move on to the more complex features of awk. 
Thanks for reading through and for any additions or clarifications, post a comment in the comments section.<\/p>\n<p><a href=\"https:\/\/www.tecmint.com\/use-linux-awk-command-to-filter-text-string-in-files\/\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When we run certain commands in Unix\/Linux to read or edit text from a string or file, we most times try to filter output to a given section of interest. This is where using regular expressions comes in handy. Read Also:\u00a010 Useful Linux Chaining Operators with Practical Examples What are Regular Expressions? A regular expression &hellip; <\/p>\n<p class=\"link-more\"><a href=\"https:\/\/www.appservgrid.com\/paw92\/index.php\/2019\/03\/15\/how-to-use-awk-and-regular-expressions-to-filter-text-or-string-in-files\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;How to Use Awk and Regular Expressions to Filter Text or String in Files&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-11622","post","type-post","status-publish","format-standard","hentry","category-linux"],"_links":{"self":[{"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/posts\/11622","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/comments?post=11622"}],"version-history":[{"count":1,"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v
2\/posts\/11622\/revisions"}],"predecessor-version":[{"id":11625,"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/posts\/11622\/revisions\/11625"}],"wp:attachment":[{"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/media?parent=11622"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/categories?post=11622"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.appservgrid.com\/paw92\/index.php\/wp-json\/wp\/v2\/tags?post=11622"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}