OS: Linux Tools

Linux provides several tools to search inside files and across the filesystem. These commands help you quickly locate information without manually browsing directories or opening every file.

Searching in Linux (grep, find, locate)

 `grep` – Search Inside Files

grep looks for text patterns inside files and outputs matching lines. It’s one of the most used commands in Linux.

  • Search for a word in a file:

    ```bash
    grep "error" logfile.txt
    ```

  • Search case-insensitively:

    ```bash
    grep -i "error" logfile.txt
    ```

  • Search recursively in a directory:

    ```bash
    grep -r "TODO" ~/projects
    ```

  • Show line numbers:

    ```bash
    grep -n "main" program.c
    ```

  • Exclude matching lines (`-v`):

    ```bash
    grep -v "debug" logfile.txt
    ```

    → Displays all lines that do not contain the word _debug_.

 grep is excellent for searching logs, source code, or configuration files.
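These flags combine freely. A runnable sketch (the log file and its contents are invented for the demo):

```bash
# Create a small throwaway log to search (demo data, not a real log).
printf 'ERROR: disk full\nall ok\nerror: retrying\n' > /tmp/grep_demo.log

# -i (ignore case) and -n (line numbers) together:
grep -in "error" /tmp/grep_demo.log
# 1:ERROR: disk full
# 3:error: retrying

# -c counts matching lines instead of printing them:
grep -ic "error" /tmp/grep_demo.log
# 2
```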


 `find` – Search for Files by Attributes

find locates files and directories based on name, type, size, or modification time. Unlike locate, it searches the filesystem in real time.

  • Find a file by name:

    ```bash
    find /home -name "notes.txt"
    ```

  • Find all .log files modified in the last 7 days:

    ```bash
    find /var/log -name "*.log" -mtime -7
    ```

  • Find large files (>100MB):

    ```bash
    find / -size +100M
    ```

 find is extremely flexible, though slower than locate.
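find can also act on whatever it matches via `-exec`. A sketch against a throwaway directory (all paths here are invented for the demo):

```bash
# Build a demo tree.
mkdir -p /tmp/find_demo
touch /tmp/find_demo/a.log /tmp/find_demo/b.log /tmp/find_demo/notes.txt

# Run a command on every match; {} is the found path, \; terminates the command.
find /tmp/find_demo -name "*.log" -exec wc -c {} \;

# Delete matches outright (use with care):
find /tmp/find_demo -name "*.log" -delete
```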


 `locate` – Instant File Search (Database-Based)

locate searches a pre-built database of files, making it much faster than find.

  • Simple search:

    ```bash
    locate passwd
    ```

  • Update the database (required if new files don’t show up):

    ```bash
    sudo updatedb
    ```

Great for quick searches, but may not show files created after the last database update.


Text Processing (less, head, tail, wc, cut)

Linux provides lightweight but powerful tools to read, preview, and manipulate text directly from the command line.

 `less` – View Files Page by Page

  • Lets you scroll through large files without loading the whole file into memory.
  • Usage:

    ```bash
    less logfile.txt
    ```

  • Controls:
    • Arrow keys → move up/down
    • `/pattern` → search for text
    • `q` → quit

 `head` – Show the Beginning of a File

  • Displays the first 10 lines by default.
  • Example:

    ```bash
    head logfile.txt
    ```

  • Show first 20 lines:

    ```bash
    head -n 20 logfile.txt
    ```


 `tail` – Show the End of a File

  • Displays the last 10 lines by default.
  • Example:

    ```bash
    tail logfile.txt
    ```

  • Show last 50 lines:

    ```bash
    tail -n 50 logfile.txt
    ```

  • Follow a file in real time (useful for logs):

    ```bash
    tail -f /var/log/syslog
    ```
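A related form worth knowing: with a leading `+`, `tail -n +K` prints from line K onward, which is handy for skipping a header row (the demo file below is invented):

```bash
printf 'header\nrow1\nrow2\n' > /tmp/tail_demo.txt

# +2 means "start at line 2", i.e. drop the first line.
tail -n +2 /tmp/tail_demo.txt
# row1
# row2
```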


 `wc` – Word, Line, and Character Count

  • Counts lines, words, and characters.
  • Example:

    ```bash
    wc notes.txt
    ```

    Output format: `lines words characters filename`

  • Just count lines:

    ```bash
    wc -l notes.txt
    ```


 `cut` – Extract Columns from Text

  • Useful for splitting text into fields (like CSV or log files).
  • Extract the first 10 characters of each line:

    ```bash
    cut -c 1-10 file.txt
    ```

  • Extract by delimiter (e.g., CSV with commas):

    ```bash
    cut -d',' -f2 data.csv
    ```

    → Shows the 2nd column.
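cut slots naturally into pipelines. A sketch over an invented colon-separated file, in the spirit of /etc/passwd:

```bash
# Demo data: username:shell
printf 'alice:/bin/bash\nbob:/bin/zsh\ncarol:/bin/bash\n' > /tmp/cut_demo.txt

# Field 1, with ':' as the delimiter:
cut -d':' -f1 /tmp/cut_demo.txt
# alice
# bob
# carol

# Piped: the distinct shells in use.
cut -d':' -f2 /tmp/cut_demo.txt | sort | uniq
# /bin/bash
# /bin/zsh
```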


Text Processing Essentials: `sort`, `uniq`, `sed`, `awk`

Linux shines at working with text. These four tools are the backbone of log parsing, data cleanup, quick reports, and one-liners. They’re powerful alone—and even better together in pipelines.

 `sort` — Order lines

sort arranges input lines lexicographically by default. It can also sort numerically, by columns (fields), reverse, and more.

Common options

  • -n → numeric sort (lexicographic order puts 10 before 2; -n compares by value)
  • -r → reverse order
  • -u → unique (suppress duplicates _after_ sorting)
  • -k M[,N] → sort by key/field range (uses whitespace by default)
  • -t 'SEP' → field separator (e.g., -t, for CSV)
  • -h → human numbers (e.g., 1K 2M 900), handy for du -h
  • -V → “version” sort (e.g., v1.9 < v1.10)

Examples

```bash
# Sort lines alphabetically
sort file.txt

# Numeric sort descending (e.g., top 10)
sort -nr scores.txt | head

# Sort CSV by 3rd column numerically
sort -t, -k3,3n data.csv

# Human-readable sizes (e.g., from du -h)
du -h /var/log | sort -h
```

Gotcha: sort -u removes duplicates _only after sorting_; to deduplicate unsorted data while preserving the first occurrence, use awk '!seen[$0]++'.


 `uniq` — Collapse adjacent duplicates

uniq filters out adjacent duplicate lines. It pairs naturally with sort (which groups duplicates).

Common options

  • _(no flag)_ → remove adjacent duplicates
  • -c → prefix counts
  • -d → only duplicates
  • -u → only unique (non-repeated) lines
  • -i → case-insensitive

Examples

```bash
# Count occurrences of each line (case-sensitive)
sort access.log | uniq -c | sort -nr

# Show only lines that appear exactly once
sort items.txt | uniq -u

# Case-insensitive unique
sort names.txt | uniq -ci
```

Gotcha: Without sort, duplicates that are not adjacent remain.
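The adjacency rule is easy to see with a three-line demo file (contents invented):

```bash
printf 'apple\nbanana\napple\n' > /tmp/uniq_demo.txt

# The two "apple" lines are not adjacent, so uniq alone keeps both:
uniq /tmp/uniq_demo.txt | wc -l
# 3

# sort groups the duplicates first, so uniq can collapse them:
sort /tmp/uniq_demo.txt | uniq | wc -l
# 2
```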


 `sed` — Stream editor (substitute, delete, insert)

sed edits text streams non-interactively. The most common task is substitution; it also deletes lines, prints ranges, and performs simple transforms.

Common patterns

  • s/OLD/NEW/ → substitute first match per line
  • s/OLD/NEW/g → substitute all matches per line
  • -i → edit file in place (use with care; consider -i.bak)
  • Addressing: N (line number), /regex/, and ranges like 1,10 or /start/,/end/

Examples

```bash
# Replace first occurrence of foo with bar per line
sed 's/foo/bar/' file.txt

# Replace all occurrences
sed 's/foo/bar/g' file.txt

# In-place rename .txt to .md inside links (make backup)
sed -i.bak 's/\.txt)/.md)/g' README.md

# Delete blank lines
sed '/^$/d' notes.txt

# Print lines 10..20
sed -n '10,20p' file.txt

# Change only in lines matching a pattern
sed '/ERROR/s/timeout/Timed Out/g' app.log
```

Gotchas

  • Delimiter can be changed (useful with slashes): sed 's|/var/log|/logs|g'
  • macOS sed -i requires a backup suffix (e.g., -i '' for none).

 `awk` — Pattern scanning & field processing

awk reads line by line, splits into fields (default: whitespace), and runs actions on matches. It’s great for column reports, filtering, and small computations.

Core syntax

```bash
awk 'pattern { action }' file
```

Special blocks: BEGIN { … } (before input), END { … } (after input)

Built-ins

  • Variables: $1, $2, … (fields), $0 (whole line), NF (num fields), NR (record/line number)
  • -F 'SEP' → custom field separator (CSV, TSV, etc.)
  • printf for controlled formatting

Examples

```bash
# Print 1st and 3rd fields
awk '{print $1, $3}' data.txt

# Sum the 2nd column (numeric)
awk '{sum += $2} END {print sum}' numbers.txt

# CSV: 2nd and 5th fields
awk -F, '{print $2, $5}' data.csv

# Filter rows where 3rd field > 100 and print id + value
awk '$3 > 100 {print $1, $3}' table.txt

# Pretty table with header
awk 'BEGIN {printf "%-10s %-10s\n","Name","Score"} {printf "%-10s %-10s\n",$1,$2}' scores.txt
```

Gotchas

  • For true CSV (quotes/commas inside quotes), prefer csvtool, xsv, mlr, or Python.
  • Use -v var=value to pass shell vars: awk -v th=100 '$3 > th {print $1,$3}' file.
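The -v pattern above, as a runnable sketch (the data file is invented):

```bash
printf 'widget 50\ngadget 150\ngizmo 200\n' > /tmp/awk_demo.txt

# Pass a shell value into awk instead of splicing it into the script text:
th=100
awk -v th="$th" '$2 > th {print $1}' /tmp/awk_demo.txt
# gadget
# gizmo
```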

Powerful Pipelines (combine them)

  • Top N frequent items:

    ```bash
    sort items.txt | uniq -c | sort -nr | head
    ```

  • Unique lines, original order preserved:

    ```bash
    awk '!seen[$0]++' file.txt
    ```

  • Extract ERROR timestamps (first 2 fields), count per minute:

    ```bash
    awk '/ERROR/ {print $1, $2}' app.log | sort | uniq -c | sort -nr
    ```

  • CSV: average of column 3:

    ```bash
    awk -F, '{sum+=$3; n++} END {if (n) print sum/n}' data.csv
    ```

  • Normalize whitespace, lowercase, then count:

    ```bash
    tr -s '[:space:]' ' ' < text.txt | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -nr
    ```

  • Batch rename in text with backup:

    ```bash
    sed -i.bak 's/\.jpeg/.jpg/g' gallery.md
    ```


Quick Reference

  • sort
    • Alphabetic: sort
    • Numeric: sort -n
    • By field: sort -t, -k2,2n
    • Human sizes: sort -h
  • uniq
    • Count: uniq -c
    • Only dups: uniq -d
    • Only uniques: uniq -u
  • sed
    • Replace all: sed 's/old/new/g'
    • Delete lines: sed '/regex/d'
    • In place: sed -i.bak 's/a/b/' file
  • awk
    • Fields: awk '{print $1,$3}'
    • Filter + sum: awk '$2>100 {s+=$2} END{print s}'
    • CSV: awk -F, '{print $2}'

Pipelines & Redirection (>, >>, |, 2>)

In Linux, the shell provides ways to redirect input/output and chain commands together. This is what makes the command line so powerful.

Output Redirection

  • > → Redirect output to a file (overwrite).

    ```bash
    ls > files.txt
    ```

    → Saves the output of ls into files.txt, replacing existing content.

  • >> → Append output to a file.

    ```bash
    echo "New entry" >> log.txt
    ```

    → Adds text at the end of log.txt without deleting old content.
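The overwrite-vs-append distinction in one runnable sketch (file path invented):

```bash
echo "first"  >  /tmp/redir_demo.txt   # creates or overwrites
echo "second" >> /tmp/redir_demo.txt   # appends
cat /tmp/redir_demo.txt
# first
# second

echo "third" > /tmp/redir_demo.txt     # > again: old contents are gone
cat /tmp/redir_demo.txt
# third
```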


Input Redirection

  • < → Take input from a file instead of the keyboard.

    ```bash
    sort < names.txt
    ```

    → Sorts the contents of names.txt.


Pipelines

  • | → Send the output of one command into another command.

    ```bash
    ls -l | grep ".txt"
    ```

    → Lists files and filters only .txt files.

  • Example: Count the number of lines containing "error" in a log:

    ```bash
    grep "error" logfile.txt | wc -l
    ```


Error Redirection

  • 2> → Redirect errors to a file.

    ```bash
    ls /root 2> errors.txt
    ```

    → Saves permission-denied errors into errors.txt.

  • 2>&1 → Redirect errors to the same place as normal output.

    ```bash
    command > output.txt 2>&1
    ```
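A sketch of merging versus splitting the two streams; the `>&2` here just fabricates an error message for the demo:

```bash
# Merge: stdout and stderr both land in one file.
{ echo "normal"; echo "oops" >&2; } > /tmp/both.txt 2>&1
cat /tmp/both.txt    # contains both lines

# Split: each stream gets its own file.
{ echo "normal"; echo "oops" >&2; } > /tmp/out.txt 2> /tmp/err.txt
cat /tmp/err.txt
# oops
```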