Working with Regular Expressions on Linux

by OpenLib · November 21, 2023

Regular expressions (regex or regexp) are powerful patterns used for matching and manipulating text strings. In the context of the shell, regular expressions are often used with commands like grep, sed, awk, and others for text processing and pattern matching. Here’s a detailed explanation of regular expressions in the shell:

Basics of Regular Expressions:

1. Literal Characters:

Ordinary characters (e.g., letters, digits) match themselves.
Example: The regex abc matches the string “abc” exactly.

2. Metacharacters:

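Special characters with a reserved meaning. Common metacharacters include . (dot), * (asterisk), + (plus), ? (question mark), | (pipe), () (parentheses), [] (square brackets), {} (curly braces), and \ (backslash). Note that grep and sed default to basic regular expressions (BRE), in which +, ?, |, (), and {} are ordinary characters unless preceded by a backslash (the forms \+, \?, and \| are GNU extensions). The -E option switches to extended regular expressions (ERE), where these characters are special as written.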

Character Classes:

1. Dot (.):

Matches any single character except a newline.

Bash

grep "a.b" filename

Matches “axb”, “aab”, “a@b”, etc.

2. Character Sets ([]):

Matches any one of the characters inside the brackets.

Bash

grep "[aeiou]" filename

Matches any line containing a vowel.
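
A quick way to try a character set is to pipe a few lines into grep with printf. The sketch below uses made-up sample words and hypothetical variable names; it is only meant to show which lines survive the filter.

```shell
# Hypothetical sample input: one word per line.
words=$(printf '%s\n' sky tree rhythm apple)

# Keep only the lines that contain at least one lowercase vowel.
with_vowel=$(printf '%s\n' "$words" | grep '[aeiou]')

printf '%s\n' "$with_vowel"
```

Here “sky” and “rhythm” are dropped because neither contains a, e, i, o, or u.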

3. Negation (^ inside []):

Matches any character NOT listed.

Bash

grep "[^0-9]" filename

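Matches any line that contains at least one non-digit character. It does not exclude lines that also contain digits; to select lines with no digits at all, use grep -v "[0-9]" instead.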
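
The difference between a negated set and an inverted match is easy to miss, so here is a small sketch with made-up input. The variable names are illustrative only.

```shell
# Hypothetical sample input.
lines=$(printf '%s\n' 12345 abc a1b2)

# [^0-9] succeeds on any line with at least one non-digit character.
has_non_digit=$(printf '%s\n' "$lines" | grep '[^0-9]')

# To select lines containing no digits at all, invert a digit match with -v.
no_digits=$(printf '%s\n' "$lines" | grep -v '[0-9]')

printf 'Lines with a non-digit:\n%s\n' "$has_non_digit"
printf 'Lines with no digits:\n%s\n' "$no_digits"
```

The first filter keeps both “abc” and “a1b2”; only the second filter narrows the result to “abc”.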

Quantifiers:

1. Asterisk (*):

Matches zero or more occurrences of the preceding character or group.

Bash

grep "a*b" filename

Matches “b”, “ab”, “aab”, “aaab”, etc.
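
Because * allows zero occurrences, an unanchored a*b matches any line that contains a “b” anywhere. The following sketch, using made-up input, contrasts that with grep -x, which requires the whole line to match.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' b ab aaab ac bat)

# Unanchored: any line containing "b" matches, including "bat".
anywhere=$(printf '%s\n' "$sample" | grep 'a*b')

# -x: the entire line must consist of zero or more "a" followed by "b".
whole=$(printf '%s\n' "$sample" | grep -x 'a*b')

printf 'anywhere:\n%s\n' "$anywhere"
printf 'whole line:\n%s\n' "$whole"
```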

2. Plus (+):

Matches one or more occurrences of the preceding character or group.

Bash

grep -E "a+b" filename

Matches “ab”, “aab”, “aaab”, etc., but not “b”. The -E option is needed because + is a literal character in grep's default basic syntax.
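
The plus sign behaves differently depending on the regex flavor grep is using. This sketch, with made-up input, runs the same pattern in basic and extended mode to show the difference.

```shell
# Hypothetical sample input; the last line contains a literal plus sign.
sample=$(printf '%s\n' b ab aab 'a+b')

# Basic syntax (the default): + is an ordinary character,
# so only the literal text "a+b" matches.
basic=$(printf '%s\n' "$sample" | grep 'a+b')

# Extended syntax (-E): + means "one or more of the preceding item".
extended=$(printf '%s\n' "$sample" | grep -E 'a+b')

printf 'basic:\n%s\n' "$basic"
printf 'extended:\n%s\n' "$extended"
```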

3. Question Mark (?):

Matches zero or one occurrence of the preceding character or group.

Bash

grep -E "ab?c" filename

Matches “abc” and “ac”. As with +, the -E option is needed for ? to act as a quantifier.
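
A common use of the optional quantifier is accepting two spellings of a word. The sample words below are made up for illustration.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' color colour colouur colr)

# With -E, "u?" makes the letter u optional.
spellings=$(printf '%s\n' "$sample" | grep -E 'colou?r')

printf '%s\n' "$spellings"
```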

4. Braces ({}):

Specifies an exact number of occurrences, or a range such as {2,4}.

Bash

grep -E "a{2}" filename

Matches “aa” but not a lone “a”. Without -E, write the braces as \{2\}.
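
The brace quantifier has two spellings, depending on whether grep runs in basic or extended mode. The sketch below, on made-up input, shows both, plus what happens when the braces are left unescaped in basic mode.

```shell
# Hypothetical sample input; the last line is the literal text a{2}.
sample=$(printf '%s\n' a aa aaa 'a{2}')

# Basic syntax: braces must be escaped to act as a quantifier.
bre=$(printf '%s\n' "$sample" | grep 'a\{2\}')

# Extended syntax: braces are a quantifier as written.
ere=$(printf '%s\n' "$sample" | grep -E 'a{2}')

# Basic syntax with bare braces: matches the literal characters a{2}.
literal=$(printf '%s\n' "$sample" | grep 'a{2}')

printf 'bre:\n%s\nere:\n%s\nliteral:\n%s\n' "$bre" "$ere" "$literal"
```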

Anchors:

1. Caret (^):

Anchors the pattern to the beginning of the line.

Bash

grep "^start" filename

Matches lines that start with “start”.

2. Dollar ($):

Anchors the pattern to the end of the line.

Bash

grep "end$" filename

Matches lines that end with “end”.
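
Using both anchors together restricts a match to the entire line. As a sketch on made-up input, the last pattern below keeps only lines made up entirely of digits.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' 42 'order 42' 42nd 007)

# ^ alone: the line must begin with 42.
starts_with=$(printf '%s\n' "$sample" | grep '^42')

# $ alone: the line must end with 42.
ends_with=$(printf '%s\n' "$sample" | grep '42$')

# Both anchors: the line must contain digits and nothing else.
digits_only=$(printf '%s\n' "$sample" | grep '^[0-9][0-9]*$')

printf 'starts:\n%s\nends:\n%s\ndigits only:\n%s\n' \
  "$starts_with" "$ends_with" "$digits_only"
```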

Escape Character (`\`):

1. Backslash (\):

Escapes a metacharacter, treating it as a literal character.

Bash

grep "a\.b" filename

Matches “a.b”.
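
When a whole pattern should be taken literally, grep -F (fixed strings) is an alternative to escaping each metacharacter. This sketch compares the three options on made-up input.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' a.b axb)

# Unescaped dot: matches any character between a and b.
unescaped=$(printf '%s\n' "$sample" | grep 'a.b')

# Escaped dot: matches only a literal period.
escaped=$(printf '%s\n' "$sample" | grep 'a\.b')

# -F treats the whole pattern as a fixed string, so no escaping is needed.
fixed=$(printf '%s\n' "$sample" | grep -F 'a.b')

printf 'unescaped:\n%s\nescaped:\n%s\nfixed:\n%s\n' \
  "$unescaped" "$escaped" "$fixed"
```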

Grouping (`()`):

1. Parentheses (()) for Grouping:

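Groups characters together so that a quantifier applies to the entire group. In grep's default basic syntax, the parentheses and braces are written with backslashes, as below; with grep -E they are written without them.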

Bash

grep "\(abc\)\{2\}" filename

Matches “abcabc”.
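
Without a group, a quantifier applies only to the single item before it. The sketch below uses extended syntax and made-up input to show how the parentheses change the meaning.

```shell
# Hypothetical sample input.
sample=$(printf '%s\n' abc abcabc abccc)

# Grouped: {2} repeats the whole sequence "abc".
grouped=$(printf '%s\n' "$sample" | grep -E '(abc){2}')

# Ungrouped: {2} repeats only the "c", so this looks for "abcc".
ungrouped=$(printf '%s\n' "$sample" | grep -E 'abc{2}')

printf 'grouped:\n%s\nungrouped:\n%s\n' "$grouped" "$ungrouped"
```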

Examples:

1. Matching IP Addresses:

Bash

grep "\([0-9]\{1,3\}\.\)\{3\}[0-9]\{1,3\}" filename

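Matches text shaped like an IPv4 address: four dot-separated groups of one to three digits. It does not check that each group is in the range 0 to 255, so “999.999.999.999” also matches.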
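
If the values of the four groups matter, one possible approach is to spell out the 0 to 255 range as an alternation and anchor the pattern to the whole line. This is a sketch under those assumptions, not the only way to validate addresses; the variable names and sample addresses are made up, and leading zeros such as “01” are deliberately rejected.

```shell
# One octet: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
octet='(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])'

# Three "octet." groups followed by a final octet, anchored to the whole line.
# Inside double quotes, \. stays as \. and \$ becomes a plain $ anchor.
ipv4="^(${octet}\.){3}${octet}\$"

# Hypothetical sample input, one candidate per line.
sample=$(printf '%s\n' 192.168.1.10 999.1.1.1 10.0.0.256 8.8.8.8)

valid=$(printf '%s\n' "$sample" | grep -E "$ipv4")

printf '%s\n' "$valid"
```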

2. Extracting Email Addresses:

Bash

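grep -E "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b" filename

Matches lines containing most common forms of email address. Add -o to print only the addresses rather than the whole line. Note that \b (word boundary) is an extension supported by GNU grep and is not part of POSIX.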
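
To pull the addresses out of surrounding text, the -o option (available in GNU and BSD grep, though not required by POSIX) prints each match on its own line. The sketch below uses a made-up sentence and omits \b for portability.

```shell
# Hypothetical input text containing two addresses.
text='Contact alice@example.com or bob.smith@mail.example.org for details.'

# -o prints only the matched part of each line, one match per output line.
emails=$(printf '%s\n' "$text" |
  grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}')

printf '%s\n' "$emails"
```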

3. Matching Numbers in a Range:

Bash

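grep -w "[1-9][0-9]\{0,2\}" filename

Matches whole-word numbers from 1 to 999. The -w option matters here: without it, the pattern also matches part of a longer number such as “1000”.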
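
A range pattern like this only limits what it matches if something stops it from matching inside a longer number. As a sketch on made-up input with one number per line, grep -x (whole-line match) shows the difference.

```shell
# Hypothetical sample input, one number per line.
sample=$(printf '%s\n' 0 7 42 999 1000 007)

# Unanchored: also matches inside "1000" and "007".
unanchored=$(printf '%s\n' "$sample" | grep '[1-9][0-9]\{0,2\}')

# -x: the whole line must be a number from 1 to 999 with no leading zero.
in_range=$(printf '%s\n' "$sample" | grep -x '[1-9][0-9]\{0,2\}')

printf 'unanchored:\n%s\nin range:\n%s\n' "$unanchored" "$in_range"
```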

Using Regular Expressions in Commands:

1. grep Command:

Bash

grep "pattern" filename

2. sed Command:

Bash

sed 's/pattern/replacement/' filename
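
As a slightly fuller sketch of sed substitution, the example below shows the effect of the g flag and the use of capture groups in the replacement. The input strings are made up; -E selects extended syntax and is supported by GNU and BSD sed.

```shell
# Without g, only the first match on each line is replaced.
first_only=$(printf 'a-b-c\n' | sed 's/-/+/')

# With g, every match on the line is replaced.
all=$(printf 'a-b-c\n' | sed 's/-/+/g')

# Capture groups: reorder a YYYY-MM-DD date to DD/MM/YYYY.
# The | delimiter avoids having to escape the slashes in the replacement.
line='2023-11-21 backup finished'
reordered=$(printf '%s\n' "$line" |
  sed -E 's|([0-9]{4})-([0-9]{2})-([0-9]{2})|\3/\2/\1|')

printf '%s\n%s\n%s\n' "$first_only" "$all" "$reordered"
```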

3. awk Command:

Bash

awk '/pattern/ {print $0}' filename
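
awk can also test a regular expression against a single field with the ~ operator, rather than against the whole line. The sketch below assumes a made-up two-column log of user names and status codes.

```shell
# Hypothetical log: a user name followed by an HTTP-style status code.
log=$(printf '%s\n' 'alice 200' 'bob 404' 'carol 500' 'dave 201')

# Print the first field of every line whose second field is a 4xx or 5xx code.
errors=$(printf '%s\n' "$log" | awk '$2 ~ /^[45][0-9][0-9]$/ {print $1}')

printf '%s\n' "$errors"
```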

Regular expressions are a fundamental tool for text processing in the shell. While the basics covered here are common across many tools, there are some variations and extensions depending on the specific command or programming language being used. Practice and experimentation will help you become more comfortable and proficient with regular expressions.
