Sed and Awk in Bash

Bash scripts often reach beyond the shell's built-in commands by calling external utilities such as sed (stream editor) and awk. Neither tool is part of Bash itself; both are standard Unix programs, and together they add sophisticated text processing to scripts that would otherwise handle only simple automation.

Sed: Editing Text Streams

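Sed reads text from a file or from standard input one line at a time, applies the commands you provide to each line, and writes the result to standard output. Think of it as a non-interactive text editor driven from the command line; the input file itself is left unchanged unless you ask for in-place editing.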

Key Sed Commands

  • Substitute (s): The most common command, it replaces text patterns with new values.
  • Delete (d): Removes lines matching a pattern.
  • Print (p): Outputs specific lines (often used with -n to suppress default output).
  • Append (a): Adds text after a line matching a pattern.
  • Insert (i): Adds text before a line matching a pattern.

Example: Substitute and Print

echo "hello world" | sed 's/hello/hi/'   # Output: hi world
sed -n '/pattern/p' file.txt            # Print lines containing 'pattern'
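The remaining commands from the list (d, a, and i) can be sketched the same way. The three-line sample input is made up for illustration, and the a and i commands use the backslash-newline form, which should work in both GNU and BSD sed.

```shell
# Hypothetical sample input shared by the commands below.
input=$(printf 'alpha\nbeta\ngamma\n')

# d: delete every line that matches the pattern.
deleted=$(printf '%s\n' "$input" | sed '/beta/d')

# a: append a new line after each matching line.
appended=$(printf '%s\n' "$input" | sed '/beta/a\
after beta')

# i: insert a new line before each matching line.
inserted=$(printf '%s\n' "$input" | sed '/beta/i\
before beta')

printf '%s\n---\n%s\n---\n%s\n' "$deleted" "$appended" "$inserted"
```

GNU sed also accepts a one-line form such as sed '/beta/a after beta', but that is an extension, so the two-line form is the safer choice in portable scripts.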

Awk: Programming with Text

Awk is a full-fledged scripting language designed for text processing. It can perform calculations, collect values in associative arrays, and generate formatted reports.

Awk Fundamentals

  • Fields ($1, $2, …): Awk automatically splits each record into fields. The default separator is any run of whitespace; the -F option changes it.
  • Records ($0): By default each line of input is one record, and $0 holds the whole current record.
  • Patterns and Actions: Awk scripts consist of patterns to match and actions to take when a match occurs.
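A minimal sketch of these three ideas working together, using made-up colon-separated records (name:department:salary). The -F option sets the separator, the pattern selects records, and the action prints a field, the built-in field count NF, and the whole record.

```shell
# Hypothetical colon-separated records: name:department:salary
data=$(printf 'ann:eng:120\nbob:ops:90\ncat:eng:110\n')

# Pattern: second field equals "eng".
# Action: print the first field, the number of fields, and the full record.
eng=$(printf '%s\n' "$data" | awk -F: '$2 == "eng" {print $1, NF, $0}')

printf '%s\n' "$eng"
```

Only the two "eng" records reach the action; the "ops" record is read but produces no output because the pattern does not match it.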

Example: Field Manipulation

awk '{print $1, $3}' file.txt    # Print the first and third field of each line

Example: Calculations

awk '{sum += $2} END {print sum}' numbers.txt  # Sum the second field of each line
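The same idea extends to per-key totals with an associative array. This is a sketch on invented data (region followed by amount); the output is piped through sort because awk does not guarantee any particular order when looping over array keys.

```shell
# Hypothetical input: region and amount, separated by a space.
sales=$(printf 'east 10\nwest 5\neast 7\nwest 1\n')

# sum[] is keyed by the first field; the END block runs after the last record.
totals=$(printf '%s\n' "$sales" |
  awk '{sum[$1] += $2} END {for (k in sum) print k, sum[k]}' |
  sort)

printf '%s\n' "$totals"
```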

Combining Sed and Awk for Advanced Scripting

ps -ef | awk 'NR > 1 {print $2}' | sed 's/^/PID: /'

This pipeline demonstrates the combined power of awk and sed:

  1. ps -ef: Lists running processes, one per line, below a header line.
  2. awk 'NR > 1 {print $2}': Skips the header (record 1) and extracts the second field (process ID) from every other line.
  3. sed 's/^/PID: /': Prepends “PID: ” to each line.
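For a task this small, the sed stage can also be folded into awk. The sketch below runs on a shortened, made-up stand-in for ps -ef output so the result is reproducible; the NR > 1 pattern skips the header line, which would otherwise come out as "PID: PID".

```shell
# Hypothetical, shortened stand-in for `ps -ef` output (header plus two processes).
sample=$(printf 'UID PID PPID CMD\nroot 1 0 init\nann 412 1 bash\n')

# NR is the current record number, so NR > 1 skips the header.
# Placing the string next to $2 concatenates the label and the field.
pids=$(printf '%s\n' "$sample" | awk 'NR > 1 {print "PID: " $2}')

printf '%s\n' "$pids"
```

Whether to use one tool or two is mostly a matter of readability: sed is concise for a single substitution, while awk is the better fit once fields are involved.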

Real-World Applications

  • Log Analysis: Extract relevant information from log files, summarize data, and create reports.
  • Data Cleanup: Modify and reformat data in CSV files or other formats.
  • Text Generation: Create customized reports and output formats.
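Two of these uses can be sketched in a few lines. The log format (date, level, message) and the CSV rows are invented for illustration; real files will need different field numbers and patterns.

```shell
# Log analysis: count lines per level (second field) in a hypothetical log.
log=$(printf '2024-01-01 ERROR disk full\n2024-01-01 INFO started\n2024-01-02 ERROR timeout\n2024-01-02 WARN slow\n')
report=$(printf '%s\n' "$log" |
  awk '{count[$2]++} END {for (lvl in count) printf "%s %d\n", lvl, count[lvl]}' |
  sort)

# Data cleanup: remove stray whitespace around the commas in CSV rows.
csv=$(printf 'ann , 30\nbob,  25\n')
clean=$(printf '%s\n' "$csv" | sed 's/[[:space:]]*,[[:space:]]*/,/g')

printf '%s\n---\n%s\n' "$report" "$clean"
```

The cleanup command treats every comma as a separator, so it is only suitable for simple CSV data without quoted fields that contain commas.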

Example: In-Place File Editing (Sed)

sed -i 's/old_text/new_text/g' file.txt 

Explanation:

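This command modifies file.txt directly instead of printing the result, because of the -i flag, and replaces every instance of old_text with new_text. The ‘g’ flag makes the substitution apply to all matches on each line; without it, sed still processes every line but replaces only the first match on each one. Note that -i is not part of POSIX: GNU sed accepts it as written here, while BSD and macOS sed require a backup suffix argument, which may be empty (sed -i '' ...).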
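Because in-place editing overwrites the original, keeping a backup is a reasonable precaution. The sketch below works on a temporary file with made-up contents and attaches the suffix directly to the flag (-i.bak), a form that both GNU and BSD sed should accept. It also shows what happens when the g flag is left out.

```shell
# Work on a throwaway file so nothing real is modified.
tmp=$(mktemp)
printf 'old_text and old_text\nno match here\n' > "$tmp"

# Edit in place and keep the original as "$tmp.bak".
sed -i.bak 's/old_text/new_text/g' "$tmp"
edited=$(cat "$tmp")
backup=$(cat "$tmp.bak")

# Without g, only the first match on each line is replaced.
first_only=$(printf 'old_text and old_text\n' | sed 's/old_text/new_text/')

rm -f "$tmp" "$tmp.bak"
printf '%s\n---\n%s\n---\n%s\n' "$edited" "$backup" "$first_only"
```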

Example: Filtering and Transformation (Awk)

awk '/error/ {print $5, $6}' logfile.txt

Explanation:

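This command selects the lines of logfile.txt that contain the string “error” and prints the fifth and sixth whitespace-separated fields of those lines. The match is case-sensitive and also hits longer words such as “errors”. Which fields carry useful information depends on the log format, so adjust the field numbers to match your own logs.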
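Awk regular expressions are case-sensitive, so a line containing “Error” would be missed by the command above. One common workaround is to lowercase the record with tolower() before matching. The syslog-style lines below are invented for illustration.

```shell
# Hypothetical syslog-style lines; field 5 is the program tag, field 6 the first message word.
log=$(printf 'Jan 1 10:00 host app: Error disk full\nJan 1 10:01 host app: ok\nJan 1 10:02 host app: error timeout\n')

# Case-sensitive match: only the lowercase "error" line is selected.
strict=$(printf '%s\n' "$log" | awk '/error/ {print $5, $6}')

# Case-insensitive match: compare against a lowercased copy of the record.
loose=$(printf '%s\n' "$log" | awk 'tolower($0) ~ /error/ {print $5, $6}')

printf '%s\n---\n%s\n' "$strict" "$loose"
```

The printed fields still come from the original record, so their capitalization is preserved in the output.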