Heaps
A heap is a tree-based data structure satisfying the heap property: every parent has priority at least as high as its children's. Heaps efficiently support priority-queue operations.
Binary Heap
Structure
A complete binary tree stored in an array. For node at index i (0-indexed):
- Parent: ⌊(i - 1) / 2⌋
- Left child: 2i + 1
- Right child: 2i + 2
Array: [90, 80, 70, 50, 60, 30, 20, 10, 40, 55]
Tree:

                  90
               /      \
             80        70
            /  \      /  \
          50    60  30    20
         /  \   /
       10   40 55
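As a quick check, the heap property for this array follows directly from the index formulas (a small Python sketch; `is_max_heap` is an illustrative helper, not a library function):

```python
# Verify the max-heap property for the example array (0-indexed):
# every node at index i >= 1 must be <= its parent at (i - 1) // 2.
heap = [90, 80, 70, 50, 60, 30, 20, 10, 40, 55]

def is_max_heap(a):
    return all(a[(i - 1) // 2] >= a[i] for i in range(1, len(a)))

print(is_max_heap(heap))  # True
```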
Max-Heap vs Min-Heap
- Max-heap: parent ≥ children; the root is the maximum.
- Min-heap: parent ≤ children; the root is the minimum.
Operations
| Operation | Time |
|---|---|
| peek (find max/min) | O(1) |
| insert | O(log n) |
| extract max/min | O(log n) |
| heapify (build heap) | O(n) |
| heap sort | O(n log n) |
Insert (Sift Up)
- Add element at the end (next available position in the array).
- Sift up: Compare with parent. If violating heap property, swap and repeat.
    PROCEDURE SIFT_UP(data, i)
        WHILE i > 0 DO
            parent ← ⌊(i - 1) / 2⌋
            IF data[i] > data[parent] THEN
                SWAP data[i] AND data[parent]
                i ← parent
            ELSE
                BREAK
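The two steps above, sketched in Python for a max-heap stored in a list (`heap_insert` is an illustrative name, not a library function):

```python
def sift_up(data, i):
    """Bubble data[i] up until its parent is >= it (max-heap)."""
    while i > 0:
        parent = (i - 1) // 2
        if data[i] > data[parent]:
            data[i], data[parent] = data[parent], data[i]
            i = parent
        else:
            break

def heap_insert(data, value):
    data.append(value)            # 1. add at the next free position
    sift_up(data, len(data) - 1)  # 2. restore the heap property

heap = []
for v in [50, 30, 70, 20, 90]:
    heap_insert(heap, v)
print(heap[0])  # 90 — the maximum is at the root
```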
Extract Max/Min (Sift Down)
- Replace root with the last element. Remove last.
- Sift down: Compare root with children. Swap with the larger (max-heap) child. Repeat.
    PROCEDURE SIFT_DOWN(data, i, n)
        LOOP
            largest ← i
            left ← 2 * i + 1
            right ← 2 * i + 2
            IF left < n AND data[left] > data[largest] THEN largest ← left
            IF right < n AND data[right] > data[largest] THEN largest ← right
            IF largest = i THEN BREAK
            SWAP data[i] AND data[largest]
            i ← largest
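A Python version of extract-max built on this sift-down (the `extract_max` name is illustrative; error handling for an empty heap is omitted):

```python
def sift_down(data, i, n):
    """Push data[i] down, always swapping with the larger child (max-heap)."""
    while True:
        largest, left, right = i, 2 * i + 1, 2 * i + 2
        if left < n and data[left] > data[largest]:
            largest = left
        if right < n and data[right] > data[largest]:
            largest = right
        if largest == i:
            break
        data[i], data[largest] = data[largest], data[i]
        i = largest

def extract_max(data):
    top = data[0]
    data[0] = data[-1]  # move the last element to the root
    data.pop()          # shrink by one
    sift_down(data, 0, len(data))
    return top

heap = [90, 80, 70, 50, 60, 30, 20, 10, 40, 55]
print(extract_max(heap), extract_max(heap))  # 90 80
```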
Build Heap (Heapify) — O(n)
Given an unordered array, build a heap in O(n) by sifting down from the last internal node to the root.
    PROCEDURE BUILD_HEAP(data)
        n ← LENGTH(data)
        FOR i ← ⌊n / 2⌋ - 1 DOWNTO 0 DO
            SIFT_DOWN(data, i, n)
Why O(n)? Most nodes are near the bottom and sift down a short distance. Formally: Σᵢ O(height(i)) = O(n) (sum of heights is linear).
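Python's standard `heapq` module implements exactly this bottom-up construction (as a min-heap), so the linear-time build is available off the shelf:

```python
import heapq

data = [10, 40, 55, 50, 60, 30, 20, 90, 80, 70]
heapq.heapify(data)  # O(n) bottom-up build, sifting down from the last internal node
print(data[0])       # 10 — the minimum ends up at the root
# Every parent is <= its children after heapify:
assert all(data[(i - 1) // 2] <= data[i] for i in range(1, len(data)))
```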
Heap Sort
- Build a max-heap: O(n).
- Repeatedly extract the max and place it at the end: O(n log n).
    PROCEDURE HEAP_SORT(arr)
        n ← LENGTH(arr)
        // Build max-heap
        FOR i ← ⌊n / 2⌋ - 1 DOWNTO 0 DO
            SIFT_DOWN(arr, i, n)
        // Extract elements
        FOR i ← n - 1 DOWNTO 1 DO
            SWAP arr[0] AND arr[i]
            SIFT_DOWN(arr, 0, i)
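A runnable Python translation of the pseudocode above (self-contained; assumes nothing beyond the list `arr`):

```python
def heap_sort(arr):
    """Sort arr in place: build a max-heap, then move the max to the end n-1 times."""
    def sift_down(i, n):
        while True:
            largest, left, right = i, 2 * i + 1, 2 * i + 2
            if left < n and arr[left] > arr[largest]:
                largest = left
            if right < n and arr[right] > arr[largest]:
                largest = right
            if largest == i:
                return
            arr[i], arr[largest] = arr[largest], arr[i]
            i = largest

    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):      # build max-heap, O(n)
        sift_down(i, n)
    for end in range(n - 1, 0, -1):          # n-1 extractions, O(n log n)
        arr[0], arr[end] = arr[end], arr[0]  # max goes to its final slot
        sift_down(0, end)

xs = [5, 1, 4, 2, 8, 0, 2]
heap_sort(xs)
print(xs)  # [0, 1, 2, 2, 4, 5, 8]
```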
Properties: In-place. Not stable. O(n log n) worst case (quicksort, by contrast, degrades to O(n²) in the worst case).
d-ary Heaps
Each node has d children instead of 2.
Height: log_d n. Lower height → fewer levels, so sift-up needs fewer comparisons: O(log_d n). Sift-down must compare with all d children at each level: O(d · log_d n).
Optimal d: Depends on the operation mix. For decrease-key-heavy workloads (Dijkstra), d = 4 is often optimal in practice.
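The index arithmetic generalizes directly from the binary case (helper names are illustrative):

```python
# d-ary heap index arithmetic for an array-backed heap (0-indexed).
def parent(i, d):
    return (i - 1) // d

def children(i, d, n):
    """Indices of node i's children that actually exist (heap size n)."""
    first = d * i + 1
    return list(range(first, min(first + d, n)))

# In a 4-ary heap with 100 nodes:
print(parent(10, 4))         # 2
print(children(10, 4, 100))  # [41, 42, 43, 44]
```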
Binomial Heaps
A collection of binomial trees of distinct orders.
Binomial tree B_k: Has 2ᵏ nodes. Root has k children (B₀, B₁, ..., B_{k-1}).
Operations
| Operation | Time |
|---|---|
| insert | O(log n) worst case, O(1) amortized |
| find min | O(log n), or O(1) with a min pointer |
| extract min | O(log n) |
| merge | O(log n) |
| decrease key | O(log n) |
| delete | O(log n) |
Key advantage: O(log n) merge (vs O(n) for binary heap).
Fibonacci Heaps
Achieves the best known amortized bounds for the standard priority-queue operations, making it theoretically optimal for many graph algorithms.
Amortized Complexity
| Operation | Amortized Time |
|---|---|
| insert | O(1) |
| find min | O(1) |
| merge | O(1) |
| extract min | O(log n) |
| decrease key | O(1) |
| delete | O(log n) |
Decrease-key in O(1) is the killer feature — it makes Dijkstra's algorithm O(E + V log V) instead of O(E log V).
Structure
A collection of heap-ordered trees (not necessarily binomial). Trees are linked in a circular doubly linked list.
Lazy operations: Insert and merge just add trees to the root list. Cleanup happens during extract-min (consolidate trees to have at most one tree of each degree).
Decrease-Key (Cascading Cuts)
- Decrease the key value.
- If heap order is violated, cut the node from its parent and add it to the root list.
- If the parent was already marked (lost a child before), cut the parent too (cascading cut).
- Mark the parent (first child loss).
The marking mechanism ensures that each node loses at most 2 children before being cut itself, maintaining O(1) amortized decrease-key.
Practical Considerations
Fibonacci heaps have large constant factors and complex implementation. In practice, simpler heaps (binary, d-ary, pairing) are often faster for typical problem sizes.
Fibonacci heaps are used when the asymptotic improvement actually matters: very large graphs with many decrease-key operations.
Pairing Heaps
A simpler alternative to Fibonacci heaps with excellent practical performance.
Structure: A multi-ary tree with heap ordering.
| Operation | Amortized |
|---|---|
| insert | O(1) |
| find min | O(1) |
| merge | O(1) |
| extract min | O(log n) |
| decrease key | O(log n) amortized (the exact bound is an open problem) |
Extract min: Remove root, then pair the children: merge them in pairs (left to right), then merge the results (right to left).
In practice: Faster than Fibonacci heaps for most workloads despite theoretically weaker decrease-key bound.
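A minimal pairing-heap sketch in Python (min-heap; `Node`, `merge`, `insert`, and `extract_min` are illustrative names, and error handling is omitted):

```python
class Node:
    """A pairing-heap node: key, leftmost child, next sibling."""
    def __init__(self, key):
        self.key = key
        self.child = None
        self.sibling = None

def merge(a, b):
    """Link two roots in O(1): the larger-key root becomes a child of the other."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a
    b.sibling = a.child  # prepend b to a's child list
    a.child = b
    return a

def insert(root, key):
    return merge(root, Node(key))  # O(1)

def extract_min(root):
    """Remove the min, then two-pass pair the orphaned children."""
    # Pass 1: merge children in pairs, left to right.
    pairs, c = [], root.child
    while c is not None:
        first, c = c, c.sibling
        second = None
        if c is not None:
            second, c = c, c.sibling
            second.sibling = None
        first.sibling = None
        pairs.append(merge(first, second))
    # Pass 2: merge the pair results, right to left.
    new_root = None
    for tree in reversed(pairs):
        new_root = merge(new_root, tree)
    return root.key, new_root

h = None
for v in [5, 3, 8, 1, 9, 2]:
    h = insert(h, v)
out = []
while h is not None:
    k, h = extract_min(h)
    out.append(k)
print(out)  # [1, 2, 3, 5, 8, 9]
```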
Leftist Heaps and Skew Heaps
Leftist Heap
A heap-ordered binary tree where the null path length (shortest path to a null node) of the left child is always ≥ the right child's.
Merge: The fundamental operation. Other operations reduce to merge. O(log n).
Skew Heap
A self-adjusting version of leftist heap. On merge, always swap children. Amortized O(log n) merge.
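Because "merge and unconditionally swap children" is all there is, a skew heap fits in a few lines (a min-heap sketch using `(key, left, right)` tuples; names are illustrative):

```python
def merge(a, b):
    """Skew-heap merge: recurse down the right spine, swapping children each step."""
    if a is None:
        return b
    if b is None:
        return a
    if b[0] < a[0]:
        a, b = b, a  # keep the smaller key on top
    key, left, right = a
    # Merge b into the right subtree, then swap the children unconditionally.
    return (key, merge(right, b), left)

def insert(h, key):
    return merge(h, (key, None, None))

def extract_min(h):
    key, left, right = h
    return key, merge(left, right)

h = None
for v in [3, 1, 4, 1, 5]:
    h = insert(h, v)
out = []
while h is not None:
    k, h = extract_min(h)
    out.append(k)
print(out)  # [1, 1, 3, 4, 5]
```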
van Emde Boas Trees
For integer keys in range {0, 1, ..., u-1}:
| Operation | Time |
|---|---|
| insert, delete | O(log log u) |
| find min/max | O(1) |
| predecessor/successor | O(log log u) |
O(log log u): Doubly logarithmic! For u = 2³², only 5 steps.
Space: O(u). Can be reduced with hashing.
Structure: Recursive. A vEB tree over universe u has one summary structure of size √u and √u clusters, each of size √u. Recurrence: T(u) = T(√u) + O(1) → T(u) = O(log log u).
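The doubly-logarithmic depth can be checked by tracking bit widths: taking √u halves the number of key bits at each level (`veb_depth` is an illustrative helper):

```python
def veb_depth(bits):
    """Levels of the vEB recursion for a universe of 2**bits values."""
    steps = 0
    while bits > 1:
        bits = (bits + 1) // 2  # sqrt(2**bits) has ceil(bits / 2) bits
        steps += 1
    return steps

print(veb_depth(32))  # 5 levels for u = 2**32: 32 -> 16 -> 8 -> 4 -> 2 -> 1
```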
Applications: Priority queues for integer keys. Shortest path algorithms. Timer management in OS kernels.
Comparison
| Heap Type | insert | extract-min | decrease-key | merge |
|---|---|---|---|---|
| Binary | O(log n) | O(log n) | O(log n) | O(n) |
| d-ary | O(log_d n) | O(d·log_d n) | O(log_d n) | O(n) |
| Binomial | O(log n) | O(log n) | O(log n) | O(log n) |
| Fibonacci | O(1)* | O(log n)* | O(1)* | O(1)* |
| Pairing | O(1)* | O(log n)* | O(log n)* | O(1)* |
| vEB | O(log log u) | O(log log u) | O(log log u) | — |
(* = amortized)
Applications in CS
- Shortest path: Dijkstra's algorithm uses a priority queue. Binary heap → O((V+E) log V). Fibonacci heap → O(E + V log V).
- Minimum spanning tree: Prim's algorithm uses a priority queue.
- Huffman coding: Build Huffman tree by repeatedly extracting two minimum-frequency nodes.
- Heap sort: O(n log n) in-place sorting.
- Event-driven simulation: Events in a priority queue ordered by time.
- Job scheduling: Priority-based CPU scheduling. Timer expiry management.
- Median finding: Two heaps (max-heap for lower half, min-heap for upper half) maintain the running median in O(log n) per insertion.
- K-way merge: Merge k sorted lists using a min-heap of size k. O(n log k).
- Top-k: Maintain a min-heap of size k. O(n log k) to find the k largest elements.
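As one worked example from the list above, the two-heap running median using Python's `heapq` (the `RunningMedian` class is a sketch, not a library type):

```python
import heapq

class RunningMedian:
    """Max-heap (via negation) for the lower half, min-heap for the upper half."""
    def __init__(self):
        self.lo = []  # negated values: lower half, max on top
        self.hi = []  # upper half, min on top

    def add(self, x):  # O(log n)
        if not self.lo or x <= -self.lo[0]:
            heapq.heappush(self.lo, -x)
        else:
            heapq.heappush(self.hi, x)
        # Rebalance so len(lo) is len(hi) or len(hi) + 1.
        if len(self.lo) > len(self.hi) + 1:
            heapq.heappush(self.hi, -heapq.heappop(self.lo))
        elif len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def median(self):  # O(1)
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2

m = RunningMedian()
for x in [2, 1, 5, 7, 2, 0, 5]:
    m.add(x)
print(m.median())  # 2 (sorted stream: 0 1 2 2 5 5 7)
```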