Part 4: Mastering Text Processing in Linux with awk, sed, and grep

Introduction

In this part of the blog series, we'll delve into some powerful command-line tools in Linux: awk, sed, and grep. These tools are essential for text processing and data extraction tasks, making them invaluable for system administrators and developers alike.

awk Command

awk is a versatile command-line tool for pattern scanning and text processing. It treats each input line as a record and splits it into whitespace-separated fields ($1, $2, and so on), which you can filter, print, or compute on.
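To make the record and field model concrete, here is a small sketch run against a made-up three-line log. The temporary file and its contents are assumptions for illustration, not the app.log used in the rest of this post. NR is the current record number, NF is the number of fields in that record, and -F changes the field separator.

```shell
# Hypothetical sample data, written to a temporary file.
log=$(mktemp)
printf '%s\n' \
  '08:51:01 INFO :main: Starting process' \
  '08:51:02 INFO :process: Running' \
  '08:51:03 WARN :main: Low memory' > "$log"

# NR = record (line) number, NF = number of fields, $NF = last field.
summary=$(awk '{print NR, NF, $1, $NF}' "$log")
echo "$summary"
# 1 5 08:51:01 process
# 2 4 08:51:02 Running
# 3 5 08:51:03 memory

# With ":" as the separator, the fourth field is the component name.
components=$(awk -F: '{print $4}' "$log")
echo "$components"
# main
# process
# main
```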

Basic Examples

Print the entire file:
```
 awk '{print}' app.log
```
Print the first and second fields:
```
 awk '{print $1,$2}' app.log
```
Print the first, second, and fourth fields:
```
 awk '{print $1,$2,$4}' app.log
```

Filter and print specific lines:

```
 awk '/mailbox_register/ {print $1,$2,$4}' app.log
```

Advanced Examples

Count occurrences of a pattern:

```
 awk '/mailbox_register/ {count++} END {print count}' app.log
```

Output: 13

Display a custom message with the count:

```
 awk '/mailbox_register/ {count++} END {print "The Count of mailbox_register is: " count}' app.log
```

Output: The Count of mailbox_register is: 13
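One caveat with this counting idiom: if the pattern never matches, count is never assigned and awk prints an empty string rather than 0. A minimal sketch of a common workaround, adding 0 to force a numeric value (the sample lines are assumptions for illustration):

```shell
# Hypothetical sample data with no mailbox_register entries.
log=$(mktemp)
printf '%s\n' \
  '08:51:01 INFO :main: Starting process' \
  '08:51:02 INFO :process: Running' > "$log"

# No line matches, so count stays unset and prints as an empty string.
plain=$(awk '/mailbox_register/ {count++} END {print count}' "$log")

# count+0 coerces the unset variable to the number 0.
forced=$(awk '/mailbox_register/ {count++} END {print count+0}' "$log")

echo "plain=[$plain] forced=[$forced]"
# plain=[] forced=[0]
```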

Filter records by time range:

```
 awk '$2 >= "08:51:00" && $2 <= "08:51:04" {print $2,$3,$4}' app.log
```
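This range filter works because awk compares the field with a quoted string as text, and zero-padded HH:MM:SS timestamps sort the same way alphabetically as they do chronologically. A small sketch, assuming a hypothetical log whose second field is the time (the dates and messages are made up):

```shell
# Hypothetical sample data: date, time, level, message.
log=$(mktemp)
printf '%s\n' \
  '2024-01-01 08:50:59 INFO boot' \
  '2024-01-01 08:51:00 INFO mailbox_register' \
  '2024-01-01 08:51:04 WARN mailslot_create' \
  '2024-01-01 08:51:05 INFO shutdown' > "$log"

# Both bounds are inclusive; the comparison is done on strings.
in_range=$(awk '$2 >= "08:51:00" && $2 <= "08:51:04" {print $2, $3, $4}' "$log")
echo "$in_range"
# 08:51:00 INFO mailbox_register
# 08:51:04 WARN mailslot_create
```

Timestamps that are not zero-padded (for example 8:51:00) would not sort correctly this way.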

Print lines 2 through 9, each prefixed with its line number:
```
 awk 'NR >= 2 && NR < 10 {print NR, $2}' app.log
```

sed Command

sed is a stream editor for filtering and transforming text.

Basic Examples

Print lines matching a pattern:
```
 sed -n '/mailslot_create/p' app.log
```

Replace text in the output (app.log itself is not modified):
```
 sed 's/mailslot_create/CREATE/g' app.log
```
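Keep in mind that this substitution only changes what sed writes to standard output. A minimal sketch that demonstrates this and saves the result by redirecting to a second file (the directory, file names, and contents here are assumptions for illustration):

```shell
# Hypothetical working directory and sample log.
workdir=$(mktemp -d)
printf '%s\n' \
  'INFO mailslot_create a' \
  'INFO mailbox_register b' > "$workdir/app.log"

# Write the transformed text to a new file; the input is left as it was.
sed 's/mailslot_create/CREATE/g' "$workdir/app.log" > "$workdir/app.new.log"

head -n 1 "$workdir/app.log"      # INFO mailslot_create a
head -n 1 "$workdir/app.new.log"  # INFO CREATE a
```

Redirecting to a new file avoids relying on in-place editing options, whose syntax differs between sed implementations.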

Print the line numbers of lines matching a pattern:
```
 sed -n -e '/mailbox_register/=' app.log
```

Combine multiple operations (print the line numbers of mailbox_register lines and the text of INFO lines):
```
 sed -n -e '/mailbox_register/=' -e '/INFO/p' app.log
```

Replace text within a range of lines:
```
 sed '1,10 s/INFO/LOG/g' app.log
```

Replace text in the first 10 lines and print only those lines, quitting after line 10:
```
 sed '1,10 s/INFO/LOG/g; 10q' app.log
```
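A note on the p command used in several sed examples: sed already prints every line once by default, so an explicit p without -n prints the line a second time. The following sketch shows the difference on assumed sample data:

```shell
# Hypothetical three-line sample.
log=$(mktemp)
printf '%s\n' 'INFO one' 'WARN two' 'INFO three' > "$log"

# Without -n: lines 1-2 are auto-printed and printed again by p,
# and line 3 is auto-printed once, giving 5 output lines.
doubled=$(sed '1,2 s/INFO/LOG/g; 1,2p' "$log" | wc -l)
echo "$((doubled))"
# 5

# With -n: only the explicit p produces output, so lines 1-2 appear once.
once=$(sed -n '1,2{s/INFO/LOG/g;p;}' "$log")
echo "$once"
# LOG one
# WARN two
```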

grep Command

grep is used for searching plain-text data for lines that match a regular expression.

Basic Examples

Search for a pattern:
```
 grep INFO app.log
```
Case-insensitive search:
```
 grep -i info app.log
```
Count the lines that match a pattern (case-insensitive):
```
 grep -i -c info app.log
```

Count matching lines using awk (case-sensitive, unlike grep -i above):
```
 awk '/INFO/ {count++} END {print count}' app.log
```
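Both grep -c and the awk counter count matching lines, not individual matches, so a line that contains the pattern twice is counted once. A sketch of the difference on assumed sample data, using grep -o (supported by GNU and BSD grep) to emit each match on its own line:

```shell
# Hypothetical sample: the first line contains INFO twice.
log=$(mktemp)
printf '%s\n' \
  'INFO start INFO again' \
  'WARN nothing here' \
  'INFO end' > "$log"

lines=$(grep -c INFO "$log")             # lines containing INFO
matches=$(grep -o INFO "$log" | wc -l)   # every individual match
awk_lines=$(awk '/INFO/ {count++} END {print count+0}' "$log")

echo "lines=$lines matches=$((matches)) awk_lines=$awk_lines"
# lines=2 matches=3 awk_lines=2
```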

Combining Commands

Combining ps, grep, and awk can be particularly powerful for process management:

List all processes:
```
 ps aux
```
Filter processes whose ps line contains a string, such as a user or command name:
```
 ps aux | grep ubuntu
```

Extract the process ID, which is the second column of ps aux output:
```
 ps aux | grep ubuntu | awk '{print $2}'
```
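A common surprise with this kind of pipeline is that grep itself can appear in the results, because its own command line contains the search term. One widely used workaround is to wrap a character of the pattern in brackets: the regular expression still matches the target, but no longer matches grep's own command line. The sketch below uses a throwaway sleep process as the target; the process and the pattern are assumptions for illustration.

```shell
# Start a throwaway background process to search for.
sleep 30 &
target=$!
sleep 1   # give the background process a moment to start

# '[s]leep 30' matches the text "sleep 30", but grep's own command line
# contains "[s]leep 30", which the pattern does not match.
pids=$(ps aux | grep '[s]leep 30' | awk '{print $2}')
echo "$pids"

# Clean up the background process.
kill "$target"
wait "$target" 2>/dev/null || true
```

The printed list should include the PID stored in target. Note that the bracket trick does not help when the search term matches the user column, since grep runs as that same user.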

Conclusion

Understanding and mastering these commands can significantly enhance your ability to handle and manipulate text files and system processes in Linux. By practicing these commands and exploring their options, you'll gain greater proficiency and efficiency in your daily tasks.

Practical Task

Create a directory, generate a log file, and apply the above commands to practice and solidify your understanding.

```
 mkdir -p logs
 printf '%s\n' "08:51:01 INFO :main: Starting process" "08:51:02 INFO :process: Running" "08:51:03 WARN :main: Low memory" > logs/app.log
```

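Experiment with the awk, sed, and grep commands on your generated app.log to see the results firsthand. Note that this sample file has the timestamp in the first field and contains no mailbox_register or mailslot_create entries, so adjust the field numbers and patterns from the earlier examples accordingly.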
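As a starting point, here is one possible run through the three tools on that sample file. It is only a sketch: the file is recreated so the block works on its own, and the expected output shown in the comments applies to these three lines only.

```shell
# Recreate the sample log from the task above.
mkdir -p logs
printf '%s\n' \
  '08:51:01 INFO :main: Starting process' \
  '08:51:02 INFO :process: Running' \
  '08:51:03 WARN :main: Low memory' > logs/app.log

# awk: print the time and level of every INFO line.
awk '/INFO/ {print $1, $2}' logs/app.log
# 08:51:01 INFO
# 08:51:02 INFO

# sed: show the last line with WARN rewritten as WARNING (file unchanged).
sed 's/WARN/WARNING/' logs/app.log | tail -n 1
# 08:51:03 WARNING :main: Low memory

# grep: count the lines containing "main", ignoring case.
grep -i -c main logs/app.log
# 2
```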
