After stacks and queues taught us controlled access (LIFO / FIFO), hash tables introduce a new idea:
direct access by meaning, not position.
Instead of asking _“where is this element?”_, we ask:
_“what is the key?”_
A hash table maps keys → values using a hash function, aiming for O(1) average access time.
Why Hash Tables Exist
Arrays are fast but require indexes.
Linked lists are flexible but slow to search.
Stacks and queues restrict access order.
Hash tables solve a different problem:
- Fast lookup
- Fast insert
- Fast delete
- No traversal required
This makes them ideal for:
- Dictionaries
- Caches
- Sets
- Indexes
- Lookup tables
Core Idea
A hash table is built from three parts:
- Key → what you search with
- Hash function → converts key to index
- Bucket array → stores the data
key ──hash()──▶ index ──▶ value
Basic Structure
Index: 0 1 2 3 4
┌───┬───┬───┬───┬───┐
Table: │ │ │ │ │ │
└───┴───┴───┴───┴───┘
Example hash:
hash(key) = key % 5
Insert keys: 7, 12, 17
7 % 5 = 2
12 % 5 = 2
17 % 5 = 2
Collision detected — welcome to real life.
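The example above can be sketched in a few lines of Python (the `hash_index` helper is illustrative, not a standard API):

```python
# Minimal sketch: the modulo hash from the example, over a 5-slot table.
def hash_index(key, table_size=5):
    return key % table_size

for key in (7, 12, 17):
    print(key, "->", hash_index(key))  # all three keys land on index 2
```

All three keys map to the same slot, which is exactly the collision the next section deals with.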
Collision Handling
Collisions are unavoidable.
What matters is how you handle them.
Separate Chaining
Each bucket holds a linked list.
Index 2
|
v
[7] -> [12] -> [17]
Full table view:
0: []
1: []
2: [7] -> [12] -> [17]
3: []
4: []
Properties
- Simple to implement
- Uses extra memory
- Performance depends on chain length
Complexity
- Average: O(1)
- Worst case: O(n) (all keys collide)
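A minimal separate-chaining sketch (the class name `ChainedHashTable` is an illustrative assumption; Python lists stand in for the linked lists in the diagram):

```python
# Separate chaining: each bucket holds a chain of (key, value) pairs.
class ChainedHashTable:
    def __init__(self, size=5):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                   # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))        # new key: append to the chain

    def search(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None                        # walk the chain; O(chain length)

    def delete(self, key):
        i = self._index(key)
        self.buckets[i] = [(k, v) for (k, v) in self.buckets[i] if k != key]
```

Inserting 7, 12, and 17 into a 5-slot instance reproduces the chain at index 2 shown above.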
Open Addressing
All values live inside the array.
Linear Probing
If the slot is occupied, move one step to the right, wrapping around at the end of the array.
Insert 7 → index 2
Insert 12 → index 2 (occupied) → index 3
Insert 17 → index 2 → 3 → 4
Index:   0    1    2    3    4
Table: [ ]  [ ]  [7]  [12] [17]
Problems
- Clustering
- Performance degrades fast at high load
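A minimal linear-probing sketch (names are illustrative; deletion is omitted because proper open-addressing deletion needs tombstones, which this section does not cover):

```python
# Open addressing with linear probing: all entries live inside one array.
class ProbingHashTable:
    def __init__(self, size=5):
        self.size = size
        self.slots = [None] * size         # each slot is (key, value) or None

    def _index(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        i = self._index(key)
        for _ in range(self.size):         # probe at most `size` slots
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
            i = (i + 1) % self.size        # occupied: move right, wrapping
        raise RuntimeError("table full")

    def search(self, key):
        i = self._index(key)
        for _ in range(self.size):
            if self.slots[i] is None:      # empty slot ends the probe chain
                return None
            if self.slots[i][0] == key:
                return self.slots[i][1]
            i = (i + 1) % self.size
        return None
```

Inserting 7, 12, and 17 fills slots 2, 3, and 4 in order, matching the probing walkthrough above.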
Load Factor
load_factor = elements / table_size
Example:
5 elements / 10 slots = 0.5
Rules of thumb:
- Above 0.7 → resize
- Below 0.3 → wasted memory
Resizing means:
- Create a bigger table
- Re-hash everything
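The two resizing steps can be sketched as one function (names are illustrative; `buckets` is a list of chains, each chain a list of (key, value) pairs):

```python
# Resize = allocate a bigger bucket array, then re-hash every entry,
# because each key's index depends on the table size.
def resize(buckets, new_size):
    new_buckets = [[] for _ in range(new_size)]
    for chain in buckets:
        for key, value in chain:
            new_buckets[hash(key) % new_size].append((key, value))
    return new_buckets

def load_factor(buckets):
    return sum(len(chain) for chain in buckets) / len(buckets)
```

Once `load_factor` climbs past the ~0.7 threshold, calling `resize` with, say, double the size brings it back down, at the one-time cost of re-hashing every entry.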
Time Complexity Summary
| Operation | Average | Worst |
|---|---|---|
| Insert | O(1) | O(n) |
| Search | O(1) | O(n) |
| Delete | O(1) | O(n) |
The worst case occurs when:
- The hash function is poor
- Collisions pile up
- The table is never resized
Examples
Phone Book
name → phone number
Compiler Symbol Table
variable_name → memory location
Cache
request → response
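Python's built-in `dict` is itself a hash table, so the phone book example works out of the box (names and numbers below are made up):

```python
# A dict is a hash table: lookup by key (meaning), not by position.
phone_book = {}
phone_book["alice"] = "555-0101"   # insert: O(1) average
phone_book["bob"] = "555-0199"
print(phone_book["alice"])         # prints 555-0101
del phone_book["bob"]              # delete: O(1) average
print("bob" in phone_book)         # membership test: prints False
```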
Why Hash Tables Are NOT Cache-Friendly
- Memory is scattered
- Pointer chasing (chaining)
- Random access patterns
This is why:
- Arrays beat hash tables in tight loops
- CPUs love contiguous memory
Hash tables give us speed without order.
But what if we want:
- Sorted keys
- Ordered traversal
- Logarithmic guarantees
That naturally leads us to trees (BST, AVL, Red-Black),
where structure replaces randomness.