6 min read
On this page

Heaps

A heap is a tree-based data structure satisfying the heap property: each parent has higher priority than its children. Heaps efficiently support priority queue operations.

Binary Heap

Structure

A complete binary tree stored in an array. For node at index i (0-indexed):

  • Parent: (i - 1) / 2
  • Left child: 2i + 1
  • Right child: 2i + 2
Array: [90, 80, 70, 50, 60, 30, 20, 10, 40, 55]

Tree:
           90
         /    \
       80      70
      / \     / \
    50   60  30  20
   / \   /
  10 40 55

Max-Heap vs Min-Heap

Max-heap: Parent ≥ children. Root is the maximum. Min-heap: Parent ≤ children. Root is the minimum.

Operations

Operation Time
peek (find max/min) O(1)
insert O(log n)
extract max/min O(log n)
heapify (build heap) O(n)
heap sort O(n log n)

Insert (Sift Up)

  1. Add element at the end (next available position in the array).
  2. Sift up: Compare with parent. If violating heap property, swap and repeat.
PROCEDURE SIFT_UP(data, i)
    WHILE i > 0 DO
        parent ← (i - 1) / 2
        IF data[i] > data[parent] THEN
            SWAP data[i] AND data[parent]
            i ← parent
        ELSE
            BREAK

Extract Max/Min (Sift Down)

  1. Replace root with the last element. Remove last.
  2. Sift down: Compare root with children. Swap with the larger (max-heap) child. Repeat.
PROCEDURE SIFT_DOWN(data, i, n)
    LOOP
        largest ← i
        left ← 2 * i + 1
        right ← 2 * i + 2
        IF left < n AND data[left] > data[largest] THEN largest ← left
        IF right < n AND data[right] > data[largest] THEN largest ← right
        IF largest = i THEN BREAK
        SWAP data[i] AND data[largest]
        i ← largest

Build Heap (Heapify) — O(n)

Given an unordered array, build a heap in O(n) by sifting down from the last internal node to the root.

PROCEDURE BUILD_HEAP(data)
    n ← LENGTH(data)
    FOR i ← (n / 2 - 1) DOWNTO 0 DO
        SIFT_DOWN(data, i, n)

Why O(n)? Most nodes are near the bottom and sift down a short distance. Formally: Σᵢ O(height(i)) = O(n) (sum of heights is linear).

Heap Sort

  1. Build a max-heap: O(n).
  2. Repeatedly extract the max and place it at the end: O(n log n).
PROCEDURE HEAP_SORT(arr)
    n ← LENGTH(arr)
    // Build max-heap
    FOR i ← (n / 2 - 1) DOWNTO 0 DO
        SIFT_DOWN(arr, i, n)
    // Extract elements
    FOR i ← (n - 1) DOWNTO 1 DO
        SWAP arr[0] AND arr[i]
        SIFT_DOWN(arr, 0, i)

Properties: In-place. Not stable. O(n log n) worst case (better than quicksort's worst case).

d-ary Heaps

Each node has d children instead of 2.

Height: log_d(n). Lower height → fewer comparisons for sift-up. Sift down: Must compare with d children → more comparisons per level.

Optimal d: Depends on the operation mix. For decrease-key-heavy workloads (Dijkstra), d = 4 is often optimal in practice.

Binomial Heaps

A collection of binomial trees of distinct orders.

Binomial tree B_k: Has 2ᵏ nodes. Root has k children (B₀, B₁, ..., B_{k-1}).

Operations

Operation Time
insert O(log n) amortized, O(1) amortized with lazy insert
find min O(log n) or O(1) with min pointer
extract min O(log n)
merge O(log n)
decrease key O(log n)
delete O(log n)

Key advantage: O(log n) merge (vs O(n) for binary heap).

Fibonacci Heaps

The theoretically optimal heap for many algorithms.

Amortized Complexity

Operation Amortized Time
insert O(1)
find min O(1)
merge O(1)
extract min O(log n)
decrease key O(1)
delete O(log n)

Decrease-key in O(1) is the killer feature — it makes Dijkstra's algorithm O(E + V log V) instead of O(E log V).

Structure

A collection of heap-ordered trees (not necessarily binomial). Trees are linked in a circular doubly linked list.

Lazy operations: Insert and merge just add trees to the root list. Cleanup happens during extract-min (consolidate trees to have at most one tree of each degree).

Decrease-Key (Cascading Cuts)

  1. Decrease the key value.
  2. If heap order is violated, cut the node from its parent and add it to the root list.
  3. If the parent was already marked (lost a child before), cut the parent too (cascading cut).
  4. Mark the parent (first child loss).

The marking mechanism ensures that each node loses at most 2 children before being cut itself, maintaining O(1) amortized decrease-key.

Practical Considerations

Fibonacci heaps have large constant factors and complex implementation. In practice, simpler heaps (binary, d-ary, pairing) are often faster for typical problem sizes.

Fibonacci heaps are used when: The theoretical improvement matters (very large graphs with many decrease-key operations).

Pairing Heaps

A simpler alternative to Fibonacci heaps with excellent practical performance.

Structure: A multi-ary tree with heap ordering.

Operation Amortized
insert O(1)
find min O(1)
merge O(1)
extract min O(log n)
decrease key O(log n) amortized (exact bound is an open problem)

Extract min: Remove root, then pair the children: merge them in pairs (left to right), then merge the results (right to left).

In practice: Faster than Fibonacci heaps for most workloads despite theoretically weaker decrease-key bound.

Leftist Heaps and Skew Heaps

Leftist Heap

A heap-ordered binary tree where the null path length (shortest path to a null node) of the left child is always ≥ the right child's.

Merge: The fundamental operation. Other operations reduce to merge. O(log n).

Skew Heap

A self-adjusting version of leftist heap. On merge, always swap children. Amortized O(log n) merge.

van Emde Boas Trees

For integer keys in range {0, 1, ..., u-1}:

Operation Time
insert, delete O(log log u)
find min/max O(1)
predecessor/successor O(log log u)

O(log log u): Doubly logarithmic! For u = 2³², only 5 steps.

Space: O(u). Can be reduced with hashing.

Structure: Recursive. A vEB tree of size u has √u summary trees and √u cluster trees, each of size √u. Recurrence: T(u) = T(√u) + O(1) → T(u) = O(log log u).

Applications: Priority queues for integer keys. Shortest path algorithms. Timer management in OS kernels.

Comparison

Heap Type insert extract-min decrease-key merge
Binary O(log n) O(log n) O(log n) O(n)
d-ary O(log_d n) O(d·log_d n) O(log_d n) O(n)
Binomial O(log n) O(log n) O(log n) O(log n)
Fibonacci O(1)* O(log n)* O(1)* O(1)*
Pairing O(1)* O(log n)* O(log n)* O(1)*
vEB O(log log u) O(log log u) O(log log u)

(* = amortized)

Applications in CS

  • Shortest path: Dijkstra's algorithm uses a priority queue. Binary heap → O((V+E) log V). Fibonacci heap → O(E + V log V).
  • Minimum spanning tree: Prim's algorithm uses a priority queue.
  • Huffman coding: Build Huffman tree by repeatedly extracting two minimum-frequency nodes.
  • Heap sort: O(n log n) in-place sorting.
  • Event-driven simulation: Events in a priority queue ordered by time.
  • Job scheduling: Priority-based CPU scheduling. Timer expiry management.
  • Median finding: Two heaps (max-heap for lower half, min-heap for upper half) maintain the running median in O(log n) per insertion.
  • K-way merge: Merge k sorted lists using a min-heap of size k. O(n log k).
  • Top-k: Maintain a min-heap of size k. O(n log k) to find the k largest elements.