Trees & Graphs

Trees and graphs show up in roughly a third of coding interviews. They test your ability to think recursively, manage traversal state, and recognize when a problem that does not look like a tree or graph problem actually is one. Many engineers find these topics intimidating, but the core patterns are surprisingly few. Master the traversals, learn to recognize the patterns, and most tree/graph problems become manageable.

Binary Trees

A binary tree is a node with at most two children. The key operations are traversals — visiting every node in a specific order.

Traversal Orders

Given this tree:
        1
       / \
      2   3
     / \   \
    4   5   6

In-order (left, root, right):   4, 2, 5, 1, 3, 6
Pre-order (root, left, right):  1, 2, 4, 5, 3, 6
Post-order (left, right, root): 4, 5, 2, 6, 3, 1
Level-order (BFS):              1, 2, 3, 4, 5, 6

When to use each:

In-order:    BST problems (gives sorted output for BST)
Pre-order:   Serialize a tree, copy a tree
Post-order:  Delete a tree, compute heights, bottom-up aggregation
Level-order: Problems involving levels, layers, or shortest paths

DFS (Depth-First Search) on Trees

DFS explores as deep as possible before backtracking. It is naturally recursive.

Recursive DFS (pre-order):
  def dfs(node):
      if node is None:
          return
      process(node)       # pre-order: process before children
      dfs(node.left)
      dfs(node.right)

Iterative DFS (using a stack):
  stack = [root]
  while stack:
      node = stack.pop()
      process(node)
      if node.right:      # push right first so left is processed first
          stack.push(node.right)
      if node.left:
          stack.push(node.left)

Most interview tree problems use recursive DFS because the code is shorter and clearer. Use iterative DFS if the interviewer asks about stack overflow risk with very deep trees or if you are more comfortable with it.

BFS (Breadth-First Search) on Trees

BFS explores all nodes at depth d before any at depth d+1. It uses a queue.

Level-order traversal:
  queue = [root]
  while queue:
      level_size = len(queue)
      current_level = []
      for i in range(level_size):
          node = queue.popleft()
          current_level.append(node.val)
          if node.left:
              queue.append(node.left)
          if node.right:
              queue.append(node.right)
      result.append(current_level)

Output for the example tree: [[1], [2, 3], [4, 5, 6]]

The level_size trick is essential — it lets you process one level at a time by snapshotting the queue size before processing.

Binary Search Trees (BST)

A BST has the property: for every node, all values in its left subtree are smaller, and all values in its right subtree are larger. This gives O(log n) search, insert, and delete for balanced BSTs.

BST operations (balanced):
  Search:   O(log n)
  Insert:   O(log n)
  Delete:   O(log n)
  In-order: O(n) — produces sorted output

BST operations (degenerate/skewed):
  All operations degrade to O(n)

Key BST insight for interviews: an in-order traversal of a BST produces elements in sorted order. Many BST problems exploit this property.

Problem: Validate if a tree is a valid BST.

Approach: Track the valid range for each node.

  def is_valid_bst(node, min_val, max_val):
      if node is None:
          return True
      if node.val <= min_val or node.val >= max_val:
          return False
      return (is_valid_bst(node.left, min_val, node.val) and
              is_valid_bst(node.right, node.val, max_val))

  is_valid_bst(root, -infinity, +infinity)

A common mistake is only checking the immediate parent constraint
(node.left < node < node.right) without propagating the range.
The value 7 in a left subtree rooted at 5 is invalid even if
its immediate parent is 6, because 7 > 5.

Common Tree Patterns

Path Sum

Problem: Does any root-to-leaf path have a sum equal to target?

  def has_path_sum(node, target):
      if node is None:
          return False
      if node.left is None and node.right is None:
          return node.val == target
      return (has_path_sum(node.left, target - node.val) or
              has_path_sum(node.right, target - node.val))

Key insight: subtract the current node's value from the target
as you go. When you reach a leaf, check if the remaining target
is zero (or equals the leaf value).

Lowest Common Ancestor (LCA)

Problem: Find the lowest common ancestor of two nodes in a binary tree.

  def lca(root, p, q):
      if root is None or root == p or root == q:
          return root
      left = lca(root.left, p, q)
      right = lca(root.right, p, q)
      if left and right:
          return root       # p and q are on different sides
      return left or right  # both are on the same side

This is one of the most elegant recursive solutions in all of
interview prep. Understand why it works: if both sides return
non-null, the current node is the LCA. If only one side returns
non-null, the LCA is in that subtree.

Maximum Depth

  def max_depth(node):
      if node is None:
          return 0
      return 1 + max(max_depth(node.left), max_depth(node.right))

This is the template for many tree problems: base case returns
a value, recursive case combines children's results.

Graphs

A graph is a set of nodes (vertices) connected by edges. Trees are a special case of graphs (connected, acyclic). Many real-world problems are graph problems: social networks, maps, dependencies, web crawling.

Representations

Adjacency List (most common in interviews):
  graph = {
      'A': ['B', 'C'],
      'B': ['A', 'D'],
      'C': ['A', 'D'],
      'D': ['B', 'C']
  }
  Space: O(V + E)
  Check if edge exists: O(degree)
  Iterate neighbors: O(degree)

Adjacency Matrix:
       A  B  C  D
  A  [ 0, 1, 1, 0 ]
  B  [ 1, 0, 0, 1 ]
  C  [ 1, 0, 0, 1 ]
  D  [ 0, 1, 1, 0 ]
  Space: O(V^2)
  Check if edge exists: O(1)
  Iterate neighbors: O(V)

Use adjacency lists unless the graph is dense (close to V^2 edges) or you need O(1) edge existence checks. In interviews, adjacency lists are almost always the right choice.

Graph BFS

BFS finds the shortest path in an unweighted graph. This is its primary interview use case.

  def bfs(graph, start):
      visited = set([start])
      queue = [start]
      while queue:
          node = queue.popleft()
          process(node)
          for neighbor in graph[node]:
              if neighbor not in visited:
                  visited.add(neighbor)
                  queue.append(neighbor)

The visited set is critical. Without it, you will loop forever
in a graph with cycles.

Graph DFS

DFS is useful for connectivity checks, topological sorting, cycle detection, and many other graph problems.

  def dfs(graph, node, visited):
      visited.add(node)
      process(node)
      for neighbor in graph[node]:
          if neighbor not in visited:
              dfs(graph, neighbor, visited)

When Problems Are Graph Problems in Disguise

This is the hardest skill to develop and the most valuable. Many problems that do not mention "graph" are graph problems:

Problem type                     Graph formulation
──────────────────────────────────────────────────────────
"Find if a path exists..."       BFS/DFS on implicit graph
Word ladder (change one letter)  BFS where each word is a node
Matrix traversal (islands)       Grid = graph, cells = nodes
Course prerequisites             Directed graph, topological sort
Social connections (friend of    BFS for shortest path
  friend distance)
State machines                   Nodes = states, edges = transitions

The grid-as-graph pattern is especially common:

Problem: Count the number of islands in a 2D grid.

  Grid:
  1 1 0 0
  1 0 0 1
  0 0 1 1

  Each '1' is a node. Adjacent '1's (up/down/left/right) share
  an edge. An island is a connected component.

  def count_islands(grid):
      count = 0
      for row in range(rows):
          for col in range(cols):
              if grid[row][col] == 1:
                  count += 1
                  bfs_or_dfs_to_mark_visited(grid, row, col)
      return count

  Answer: 3 islands

Choosing BFS vs DFS

Use BFS when:
  - Finding shortest path in unweighted graph
  - Level-order processing (nearest neighbors first)
  - Problems ask for "minimum number of steps/moves"

Use DFS when:
  - Checking if a path exists (either works, but DFS is simpler)
  - Topological sorting
  - Detecting cycles
  - Exhaustive search (all paths, all solutions)
  - Problems involving backtracking

Topological Sort

For directed acyclic graphs (DAGs), topological sort orders nodes so that every edge goes from earlier to later. Classic use: course scheduling.

Problem: Given prerequisites, find a valid course order.

  Kahn's Algorithm (BFS-based):
  1. Compute in-degree for every node
  2. Add all nodes with in-degree 0 to queue
  3. While queue is not empty:
     - Remove a node, add to result
     - For each neighbor, decrement in-degree
     - If in-degree becomes 0, add to queue
  4. If result has all nodes, return it. Otherwise, cycle exists.

  If the graph has a cycle, topological sort is impossible.
  This is how you detect cycles in a directed graph.

Tree vs Graph: Key Differences

                    Tree                    Graph
──────────────────────────────────────────────────────
Cycles              Never                   Possible
Root                Exactly one             No concept
Parent per node     Exactly one (except root) Multiple possible
Visited tracking    Not needed              Essential
Connected           Always (by definition)  Not necessarily

The biggest mistake when moving from tree to graph problems: forgetting to track visited nodes. In a tree, you naturally do not revisit nodes because you move from parent to child. In a graph, cycles mean you will loop infinitely without a visited set.

Common Pitfalls

Forgetting the visited set in graphs: This causes infinite loops in cyclic graphs and is the most common graph bug in interviews.
Confusing BFS and DFS: BFS uses a queue, DFS uses a stack (or recursion). Mixing them up gives wrong results for shortest-path problems.
Off-by-one in level-order traversal: Forgetting to snapshot the queue size at the start of each level means you mix up which nodes belong to which level.
Not handling disconnected graphs: If the graph might be disconnected, you need to run BFS/DFS from every unvisited node, not just from one starting point.
BST validation checking only parent: Checking left < node < right is insufficient. You must propagate the valid range through the entire subtree.
Assuming tree problems need iteration: Recursive solutions for tree problems are almost always shorter and clearer. Default to recursion unless there is a specific reason not to.
Ignoring edge cases: Empty tree (null root), single-node tree, tree with only left children (degenerate), graph with no edges.

Key Takeaways

Master the four traversals (in-order, pre-order, post-order, level-order). Most tree problems are variations of one of these.
In-order traversal of a BST gives sorted output. This property is the basis for many BST interview problems.
BFS finds shortest paths in unweighted graphs. If the problem asks for "minimum steps" or "minimum distance," BFS is almost always the answer.
Many problems are graphs in disguise: grids, word transformations, state transitions, and dependency orderings. Recognizing the graph structure is half the battle.
Always use a visited set for graph traversals. Always. Forgetting it is the most common graph bug and will cost you the problem in an interview.
The LCA algorithm and the BST validation algorithm are two of the most frequently asked tree problems. Understand them deeply, not just memorize them.