5 min read
On this page

Sorting & Searching

Sorting and searching are foundational algorithms that appear in interviews both directly ("implement binary search") and indirectly (as building blocks for harder problems). Binary search, in particular, is one of the most versatile techniques in your toolkit — it applies far beyond sorted arrays. Knowing when and how to apply these algorithms is more important than memorizing their implementations.

Binary search finds a target value in a sorted array in O(log n) time by repeatedly halving the search space.

Basic binary search:
  left = 0
  right = len(arr) - 1

  while left <= right:
      mid = left + (right - left) // 2    # avoids integer overflow
      if arr[mid] == target:
          return mid
      elif arr[mid] < target:
          left = mid + 1
      else:
          right = mid - 1

  return -1   # not found

The implementation looks simple, but getting the boundary conditions right is where most candidates fail. The three critical decisions are:

1. Is right initialized to len(arr) - 1 or len(arr)?
2. Is the loop condition left <= right or left < right?
3. Do we update with mid + 1 and mid - 1, or mid + 1 and mid?

These must be consistent. The most common combination:
  right = len(arr) - 1
  while left <= right
  left = mid + 1, right = mid - 1

Alternative (half-open interval):
  right = len(arr)
  while left < right
  left = mid + 1, right = mid

Binary Search Variants

The real power of binary search in interviews is its variants. Most interviewers do not ask for plain binary search — they ask for one of these:

First occurrence of target:

  left = 0
  right = len(arr) - 1
  result = -1

  while left <= right:
      mid = left + (right - left) // 2
      if arr[mid] == target:
          result = mid        # record it, but keep searching left
          right = mid - 1
      elif arr[mid] < target:
          left = mid + 1
      else:
          right = mid - 1

  return result

Last occurrence of target: Same structure, but when you find target, search right instead:

      if arr[mid] == target:
          result = mid
          left = mid + 1     # keep searching right

Search in a rotated sorted array:

  Input: [4, 5, 6, 7, 0, 1, 2], target = 0

  The array is sorted but rotated. One half is always sorted.

  while left <= right:
      mid = left + (right - left) // 2
      if arr[mid] == target:
          return mid

      if arr[left] <= arr[mid]:          # left half is sorted
          if arr[left] <= target < arr[mid]:
              right = mid - 1            # target is in left half
          else:
              left = mid + 1             # target is in right half
      else:                              # right half is sorted
          if arr[mid] < target <= arr[right]:
              left = mid + 1             # target is in right half
          else:
              right = mid - 1            # target is in left half

Binary search on answer space: This is the most powerful variant. Instead of searching in an array, you search across a range of possible answers.

  Problem: Find the minimum capacity to ship packages within D days.

  Approach: Binary search on the capacity value.
    - Minimum possible capacity: max(weights) (must fit the heaviest package)
    - Maximum possible capacity: sum(weights) (ship everything in one day)
    - For each candidate capacity, check if it works in D days

  left = max(weights)
  right = sum(weights)

  while left < right:
      mid = left + (right - left) // 2
      if can_ship_in_d_days(mid, weights, D):
          right = mid          # try a smaller capacity
      else:
          left = mid + 1       # need more capacity

  return left

This pattern — binary search on the answer — appears in many problems: minimizing maximum, finding the kth element, splitting arrays with constraints.

Sorting Algorithms

You rarely need to implement a full sorting algorithm in an interview. But you need to know the properties of each to make the right choice when sorting is part of a larger solution.

Algorithm      Time (avg)    Time (worst)   Space    Stable
──────────────────────────────────────────────────────────────
Merge sort     O(n log n)    O(n log n)     O(n)     Yes
Quick sort     O(n log n)    O(n^2)         O(log n) No
Heap sort      O(n log n)    O(n log n)     O(1)     No
Counting sort  O(n + k)      O(n + k)       O(k)     Yes
Radix sort     O(d * n)      O(d * n)       O(n)     Yes

Merge Sort

Merge sort is the go-to when you need guaranteed O(n log n) or a stable sort. It is also the preferred algorithm for sorting linked lists (because it does not need random access).

Merge sort on a linked list:
  1. Find the middle using slow/fast pointers
  2. Split the list into two halves
  3. Recursively sort each half
  4. Merge the two sorted halves

  def merge_sort(head):
      if head is None or head.next is None:
          return head

      mid = find_middle(head)
      right_half = mid.next
      mid.next = None          # split the list

      left = merge_sort(head)
      right = merge_sort(right_half)
      return merge(left, right)

Time: O(n log n). Space: O(log n) for recursion stack.
No extra array needed (unlike array merge sort).

Quick Select

Quick select finds the kth smallest (or largest) element in O(n) average time. It uses the partitioning step from quicksort but only recurses into one half.

Problem: Find the kth largest element in an unsorted array.

  def quick_select(arr, left, right, k):
      pivot_index = partition(arr, left, right)

      if pivot_index == k:
          return arr[pivot_index]
      elif pivot_index < k:
          return quick_select(arr, pivot_index + 1, right, k)
      else:
          return quick_select(arr, left, pivot_index - 1, k)

  Partition places the pivot in its correct sorted position.
  Elements smaller than pivot go left; larger go right.

  Average: O(n). Worst: O(n^2) with bad pivot choices.
  To guarantee O(n), use median-of-medians pivot selection
  (but this is rarely asked in interviews).

Quick select is preferable to sorting when you only need the kth element, since it avoids the full O(n log n) cost.

When O(n log n) Is Not Fast Enough

Sometimes the problem constraints or follow-up questions demand better than O(n log n):

Technique         When to use                    Complexity
────────────────────────────────────────────────────────────
Counting sort     Small range of integer values   O(n + k)
Radix sort        Fixed-length integers/strings    O(d * n)
Bucket sort       Uniformly distributed values     O(n) avg
Quick select      Only need kth element            O(n) avg
Two pointers      Already sorted, need pairs       O(n)
Hash map          Need frequency counting          O(n)

If the interviewer asks "can you do better than O(n log n)?", the answer usually involves one of these approaches — or rethinking whether you need to sort at all.

Custom Comparators

Many problems require sorting by a custom rule. This is straightforward but easy to get wrong.

Problem: Sort a list of strings by their numeric value.
  Input: ["10", "2", "30"]
  Default sort: ["10", "2", "30"] (lexicographic)
  Desired sort: ["2", "10", "30"] (numeric)

  sorted(strings, key=lambda s: int(s))

Problem: Sort intervals by start time, breaking ties by end time.
  sorted(intervals, key=lambda x: (x[0], x[1]))

Problem: Arrange numbers to form the largest possible number.
  Input: [3, 30, 34, 5, 9]
  Output: "9534330"

  The comparator: given two numbers a and b, compare "ab" vs "ba".
  Is "330" > "303"? Yes, so 3 comes before 30.

  from functools import cmp_to_key
  def compare(a, b):
      if str(a) + str(b) > str(b) + str(a):
          return -1
      return 1
  sorted(nums, key=cmp_to_key(compare))

Practical Interview Applications

Finding Duplicates

Approach 1: Sort and scan adjacent elements. O(n log n) time, O(1) space.
  arr.sort()
  for i in range(1, len(arr)):
      if arr[i] == arr[i-1]:
          return arr[i]

Approach 2: Hash set. O(n) time, O(n) space.
  seen = set()
  for num in arr:
      if num in seen:
          return num
      seen.add(num)

Merge Intervals

Problem: Merge overlapping intervals.
  Input: [[1,3], [2,6], [8,10], [15,18]]
  Output: [[1,6], [8,10], [15,18]]

  Sort by start time, then merge greedily:
  intervals.sort(key=lambda x: x[0])
  merged = [intervals[0]]

  for start, end in intervals[1:]:
      if start <= merged[-1][1]:
          merged[-1][1] = max(merged[-1][1], end)
      else:
          merged.append([start, end])

Two Sum Variants with Sorting

Problem: Find all unique pairs that sum to target.

  Sort the array, then use two pointers:
  arr.sort()
  left = 0
  right = len(arr) - 1
  results = []

  while left < right:
      current_sum = arr[left] + arr[right]
      if current_sum == target:
          results.append([arr[left], arr[right]])
          left += 1
          right -= 1
          while left < right and arr[left] == arr[left-1]:
              left += 1     # skip duplicates
      elif current_sum < target:
          left += 1
      else:
          right -= 1

Binary Search Decision Framework

When facing a problem in an interview, ask yourself:

Is the input sorted or can it be sorted?
  -> Classic binary search or sort + binary search

Is there a monotonic relationship between input and output?
  -> Binary search on the answer space

Do you need the kth element?
  -> Quick select (O(n)) or sort (O(n log n))

Does the problem ask for "minimum maximum" or "maximum minimum"?
  -> Binary search on the answer space

Is the search space too large for linear scan?
  -> Binary search

Common Pitfalls

  • Off-by-one errors in binary search: The boundary conditions (left vs right, inclusive vs exclusive, mid +/- 1) are the most common source of bugs. Always trace through a 3-element example to verify.
  • Integer overflow in mid calculation: (left + right) / 2 can overflow in languages with fixed-size integers. Use left + (right - left) / 2 instead.
  • Forgetting to handle duplicates: When searching for the first or last occurrence, a naive binary search returns any occurrence. You must continue searching after finding a match.
  • Using sort when you do not need to: If you only need the kth element, quick select is O(n) average. If you only need the top k, a heap of size k is O(n log k). Sorting the entire array is often overkill.
  • Not recognizing binary search on answer space: Many problems that do not look like search problems (minimum capacity, maximum distance, splitting arrays) are solvable with binary search on the answer. If the answer has a monotonic feasibility property, binary search applies.
  • Infinite loops in binary search: If your update rules do not guarantee the search space shrinks every iteration, you will loop forever. Ensure that left or right moves toward the other in every branch.
  • Unstable sort when order matters: If equal elements must maintain their original order, use a stable sort (merge sort) or add the original index as a tiebreaker.

Key Takeaways

  • Binary search is not just for finding elements in sorted arrays. Its most powerful application is searching on the answer space — any time a monotonic feasibility function exists, binary search applies.
  • Get the binary search boundary conditions right. Pick one convention (closed interval with left <= right or half-open with left < right) and practice it until it is automatic.
  • Quick select finds the kth element in O(n) average time. Use it when you need a single order statistic, not the full sorted result.
  • Merge sort is the right choice for linked lists and when you need stability. Quick sort is the right choice for in-place array sorting when stability does not matter.
  • Custom comparators are a frequent interview requirement. Know how to sort by arbitrary criteria in your language of choice.
  • When O(n log n) is not fast enough, consider counting sort, radix sort, or whether sorting is even necessary — sometimes a hash map or two pointers on sorted data gets you to O(n).