5 min read
On this page

Binary Trees

Binary trees are hierarchical structures where each node has at most two children. They provide the foundation for search trees, expression trees, and many algorithms.

Hierarchy of tree types including BST, AVL, Red-Black, B-tree, and more

Traversals

Depth-First Traversals

In-order (Left, Root, Right): Visits nodes in sorted order for BSTs.

     4
    / \
   2   6
  / \ / \
 1  3 5  7
In-order: 1, 2, 3, 4, 5, 6, 7

Pre-order (Root, Left, Right): Root is visited first. Used for serialization, copying.

Pre-order: 4, 2, 1, 3, 6, 5, 7

Post-order (Left, Right, Root): Root is visited last. Used for deletion, expression evaluation.

Post-order: 1, 3, 2, 5, 7, 6, 4

Level-order (BFS): Visit level by level, left to right. Uses a queue.

Level-order: 4, 2, 6, 1, 3, 5, 7

Morris Traversal

In-order traversal with O(1) space (no stack, no recursion). Temporarily modifies the tree using threaded pointers.

Algorithm:

  1. If no left child: visit current, go right.
  2. If left child exists: Find the in-order predecessor (rightmost node in left subtree).
    • If predecessor's right is null: set it to current (create thread), go left.
    • If predecessor's right is current: remove thread, visit current, go right.

Time: O(n). Space: O(1). Tree is restored to original state.

Binary Search Trees (BST)

A BST maintains the BST property: For every node, all values in the left subtree are less, and all values in the right subtree are greater.

Operations

| Operation | Average | Worst | |---|---|---| | Search | O(log n) | O(n) | | Insert | O(log n) | O(n) | | Delete | O(log n) | O(n) | | Min/Max | O(log n) | O(n) | | Successor/Predecessor | O(log n) | O(n) |

Worst case occurs when the tree degenerates into a linked list (e.g., inserting sorted data).

Deletion

Three cases:

  1. Leaf: Simply remove.
  2. One child: Replace with the child.
  3. Two children: Replace with the in-order successor (smallest in right subtree) or predecessor (largest in left subtree). Then delete the successor/predecessor.

Self-Balancing BSTs

Automatically maintain O(log n) height after insertions and deletions.

AVL Trees

Balance factor = height(left) - height(right). Must be {-1, 0, 1} for every node.

Rotations to restore balance after insert/delete:

Right Rotation (for left-heavy):

      z              y
     / \            / \
    y   T4   →    x    z
   / \           / \  / \
  x   T3        T1 T2 T3 T4
 / \
T1  T2

Left Rotation (mirror of right rotation).

Four cases:

  • Left-Left: Right rotation.
  • Right-Right: Left rotation.
  • Left-Right: Left rotation on child, then right rotation.
  • Right-Left: Right rotation on child, then left rotation.

Properties: Height ≤ 1.44 log₂(n+2). Strictly balanced (height difference ≤ 1). Faster lookup than red-black trees (shorter height). More rotations on insert/delete.

Red-Black Trees

Each node is colored red or black. Rules:

  1. Root is black.
  2. Leaves (NIL) are black.
  3. Red node's children are both black (no consecutive reds).
  4. Every path from root to leaf has the same number of black nodes (black-height).

Properties: Height ≤ 2 log₂(n+1). Less strictly balanced than AVL. Fewer rotations on insert/delete (at most 2 rotations for insert, 3 for delete).

Insert fixup: Recolor and rotate (at most 2 rotations + O(log n) recoloring).

Delete fixup: More complex. Recolor and rotate (at most 3 rotations).

Used in: Linux kernel (CFS scheduler, memory maps), Java TreeMap, C++ std::map/std::set, Rust BTreeMap (uses B-tree actually, but red-black is the classic choice).

Splay Trees

Self-adjusting: Recently accessed nodes are moved to the root via splaying (a sequence of rotations).

Splay operation: Move node x to the root using rotations:

  • Zig: x is child of root → single rotation.
  • Zig-zig: x and parent are both left (or both right) children → two rotations (same direction).
  • Zig-zag: x is left child, parent is right child (or vice versa) → two rotations (different directions).

Amortized O(log n) per operation. No extra storage (no balance factor or color).

Properties:

  • Adapts to access patterns (frequently accessed items stay near root).
  • Working set property: If only k distinct items are accessed, operations take O(log k).
  • Static optimality: Within O(1) of the best static BST for any access sequence.
  • Worst case for a single operation: O(n).

Treaps

Combine BST property (on keys) with heap property (on random priorities).

Each node has a key (BST order) and a random priority (heap order — parent priority > children priorities).

Insert: BST insert, then rotate up to maintain heap property. Delete: Rotate down (like removing from a heap), then remove.

Expected height: O(log n) (random priorities ensure balance with high probability).

Advantages: Simple to implement. Randomized — no worst case for any key sequence. Easy split/merge operations.

Scapegoat Trees

Maintain a loose balance invariant: height ≤ log_{3/2}(n). When a node is inserted and its depth exceeds this bound, find a scapegoat — an ancestor where the subtree is sufficiently unbalanced — and rebuild that subtree perfectly balanced.

Amortized O(log n). No extra storage per node (no color, balance factor, or priority). Simple rebuild operation.

Tree Properties

Height: Longest path from root to leaf.

  • Balanced BST: Θ(log n)
  • Degenerate: Θ(n)

Number of nodes: For a full binary tree of height h: 2^(h+1) - 1.

Number of leaves: For a full binary tree: 2^h.

Relation: For any binary tree, leaves = internal_nodes + 1 (for full trees).

Augmented Trees

Add extra information to tree nodes to support additional operations.

Order-statistic tree: Each node stores the size of its subtree. Supports:

  • Select(k): Find the k-th smallest element in O(log n).
  • Rank(x): Find the rank of element x in O(log n).

Interval tree: Augmented BST for interval operations (covered in advanced data structures).

Applications in CS

  • Databases: B-trees and variants for indexing (covered separately). Red-black trees for in-memory indexes.
  • Language runtimes: Balanced BSTs for ordered maps/sets (C++ std::map, Java TreeMap).
  • Operating systems: Red-black trees for scheduling (CFS), memory management (VM areas), I/O scheduling.
  • Compilers: Abstract syntax trees (ASTs). Expression trees for code generation.
  • File systems: Directory trees. Btrfs uses B-trees extensively.
  • Networking: IP routing tables, firewall rules (interval trees).
  • Computational geometry: Range trees, k-d trees (covered separately).