DSA: Hash Table

A hash table maps keys to values using a hash function to achieve near constant-time access.

After stacks and queues taught us controlled access (LIFO / FIFO), hash tables introduce a new idea:

direct access by meaning, not position.

Instead of asking _“where is this element?”_, we ask:

_“what is the key?”_


Why Hash Tables Exist

Arrays are fast but require indexes.

Linked lists are flexible but slow to search.

Stacks and queues restrict access order.

Hash tables solve a different problem:

  • Fast lookup
  • Fast insert
  • Fast delete
  • No traversal required

This makes them ideal for:

  • Dictionaries
  • Caches
  • Sets
  • Indexes
  • Lookup tables

Core Idea

A hash table is built from three parts:

  • Key → what you search with
  • Hash function → converts key to index
  • Bucket array → stores the data

```md
key ──hash()──▶ index ──▶ value
```


Basic Structure

```md
Index:   0    1    2    3    4
        ┌───┬───┬───┬───┬───┐
Table:  │   │   │   │   │   │
        └───┴───┴───┴───┴───┘
```

Example hash:

```md
hash(key) = key % 5
```

Insert keys: 7, 12, 17

```md
7  % 5 = 2
12 % 5 = 2
17 % 5 = 2
```
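The collisions above can be checked in a couple of lines (a minimal sketch using the same `key % 5` hash):

```python
# All three keys land in the same bucket under hash(key) = key % 5.
table_size = 5

for key in (7, 12, 17):
    print(key, "->", key % table_size)
# 7 -> 2, 12 -> 2, 17 -> 2: three keys, one slot.
```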

Collision detected — welcome to real life.

Collision Handling

Collisions are unavoidable.

What matters is how you handle them.


Separate Chaining

Each bucket holds a linked list.

```md
Index 2
  |
  v
[7] -> [12] -> [17]
```

Full table view:

```md
0: []
1: []
2: [7] -> [12] -> [17]
3: []
4: []
```

Properties

  • Simple to implement
  • Uses extra memory
  • Performance depends on chain length

Complexity

  • Average: O(1)
  • Worst case: O(n) (all keys collide)
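The chaining scheme above can be sketched in Python. This is an illustrative toy (class and method names are my own), using a Python list per bucket in place of a linked list:

```python
class ChainedHashTable:
    """Separate chaining: each bucket holds a chain of (key, value) pairs."""

    def __init__(self, size=5):
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # new key: append to the chain

    def search(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None                       # walking the chain is the O(n) worst case

    def delete(self, key):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return True
        return False
```

With keys 7, 12, 17 and 5 buckets, all three pairs end up chained in bucket 2, matching the diagram above.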

Open Addressing

All values live inside the array.

Linear Probing

If the slot is occupied, move right, wrapping around to index 0 at the end of the array.

```md
Insert 7  → index 2
Insert 12 → index 2 (occupied) → index 3
Insert 17 → index 2 → 3 → 4
```

```md
Index: 0   1   2   3   4
       []  []  7  12  17
```

Problems

  • Clustering
  • Performance degrades fast at high load
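Linear probing fits in a short function. A minimal sketch (the helper name is illustrative, and deletion is deliberately left out since it needs tombstones):

```python
# Linear probing insert: on collision, scan right for a free slot.
def probe_insert(table, key):
    size = len(table)
    i = key % size
    while table[i] is not None:       # slot taken: step right
        i = (i + 1) % size            # wrap around at the end of the array
    table[i] = key
    return i

table = [None] * 5
for key in (7, 12, 17):
    probe_insert(table, key)
print(table)   # [None, None, 7, 12, 17]
```

Note how 7, 12, 17 pile up in adjacent slots 2, 3, 4: that run is exactly the clustering problem listed above.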

Load Factor

```md
load_factor = elements / table_size
```

Example:

```md
5 elements / 10 slots = 0.5
```

Rules of thumb:

  • > 0.7 → resize
  • < 0.3 → wasted memory

Resizing means:

  • Create a bigger table
  • Re-hash everything
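The resize rule can be sketched as a function. This is a toy, assuming the chained-bucket layout from earlier and a hypothetical `maybe_resize` helper; real tables fold this into insert:

```python
# Resizing sketch: past a 0.7 load factor, double the table and re-hash.
def maybe_resize(buckets, count):
    if count / len(buckets) <= 0.7:
        return buckets                              # load is fine: keep the table
    new_buckets = [[] for _ in range(len(buckets) * 2)]
    for bucket in buckets:                          # re-hash every stored pair,
        for key, value in bucket:                   # since indexes depend on size
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets
```

Re-hashing is O(n), but because it happens rarely (the table doubles each time), inserts stay O(1) amortized.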

Time Complexity Summary

| Operation | Average | Worst |
| --------- | ------- | ----- |
| Insert    | O(1)    | O(n)  |
| Search    | O(1)    | O(n)  |
| Delete    | O(1)    | O(n)  |

Worst case happens when:

  • The hash function distributes keys poorly
  • Too many keys collide
  • The table is never resized

Examples

Phone Book

```md
name → phone number
```

Compiler Symbol Table

```md
variable_name → memory location
```

Cache

```md
request → response
```
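In Python, all three examples are one data structure away: the built-in `dict` is a hash table. The phone book, for instance (entries are made up):

```python
# Python's dict is a hash table: the phone book is one line per entry.
phone_book = {}
phone_book["alice"] = "555-0100"      # insert: name -> phone number
phone_book["bob"] = "555-0199"

print(phone_book["alice"])            # lookup by key, average O(1)
del phone_book["bob"]                 # delete
print("bob" in phone_book)            # membership test, prints False
```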


Why Hash Tables Are NOT Cache-Friendly

  • Memory is scattered
  • Pointer chasing (chaining)
  • Random access patterns

This is why:

  • Arrays beat hash tables in tight loops
  • CPUs love contiguous memory

Hash tables give us speed without order.

But what if we want:

  • Sorted keys
  • Ordered traversal
  • Logarithmic guarantees

That naturally leads us to Trees (BST, AVL, Red-Black)

where structure replaces randomness.