Sorting & Searching
Sorting and searching are foundational algorithms that appear in interviews both directly ("implement binary search") and indirectly (as building blocks for harder problems). Binary search, in particular, is one of the most versatile techniques in your toolkit — it applies far beyond sorted arrays. Knowing when and how to apply these algorithms is more important than memorizing their implementations.
Binary Search
Binary search finds a target value in a sorted array in O(log n) time by repeatedly halving the search space.
Basic binary search:
left = 0
right = len(arr) - 1
while left <= right:
mid = left + (right - left) // 2 # avoids integer overflow
if arr[mid] == target:
return mid
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return -1 # not found
The implementation looks simple, but getting the boundary conditions right is where most candidates fail. The three critical decisions are:
1. Is right initialized to len(arr) - 1 or len(arr)?
2. Is the loop condition left <= right or left < right?
3. Do we update with mid + 1 and mid - 1, or mid + 1 and mid?
These must be consistent. The most common combination:
right = len(arr) - 1
while left <= right
left = mid + 1, right = mid - 1
Alternative (half-open interval):
right = len(arr)
while left < right
left = mid + 1, right = mid
Binary Search Variants
The real power of binary search in interviews is its variants. Most interviewers do not ask for plain binary search — they ask for one of these:
First occurrence of target:
left = 0
right = len(arr) - 1
result = -1
while left <= right:
mid = left + (right - left) // 2
if arr[mid] == target:
result = mid # record it, but keep searching left
right = mid - 1
elif arr[mid] < target:
left = mid + 1
else:
right = mid - 1
return result
Last occurrence of target: Same structure, but when you find target, search right instead:
if arr[mid] == target:
result = mid
left = mid + 1 # keep searching right
Search in a rotated sorted array:
Input: [4, 5, 6, 7, 0, 1, 2], target = 0
The array is sorted but rotated. One half is always sorted.
while left <= right:
mid = left + (right - left) // 2
if arr[mid] == target:
return mid
if arr[left] <= arr[mid]: # left half is sorted
if arr[left] <= target < arr[mid]:
right = mid - 1 # target is in left half
else:
left = mid + 1 # target is in right half
else: # right half is sorted
if arr[mid] < target <= arr[right]:
left = mid + 1 # target is in right half
else:
right = mid - 1 # target is in left half
Binary search on answer space: This is the most powerful variant. Instead of searching in an array, you search across a range of possible answers.
Problem: Find the minimum capacity to ship packages within D days.
Approach: Binary search on the capacity value.
- Minimum possible capacity: max(weights) (must fit the heaviest package)
- Maximum possible capacity: sum(weights) (ship everything in one day)
- For each candidate capacity, check if it works in D days
left = max(weights)
right = sum(weights)
while left < right:
mid = left + (right - left) // 2
if can_ship_in_d_days(mid, weights, D):
right = mid # try a smaller capacity
else:
left = mid + 1 # need more capacity
return left
This pattern — binary search on the answer — appears in many problems: minimizing maximum, finding the kth element, splitting arrays with constraints.
Sorting Algorithms
You rarely need to implement a full sorting algorithm in an interview. But you need to know the properties of each to make the right choice when sorting is part of a larger solution.
Algorithm Time (avg) Time (worst) Space Stable
──────────────────────────────────────────────────────────────
Merge sort O(n log n) O(n log n) O(n) Yes
Quick sort O(n log n) O(n^2) O(log n) No
Heap sort O(n log n) O(n log n) O(1) No
Counting sort O(n + k) O(n + k) O(k) Yes
Radix sort O(d * n) O(d * n) O(n) Yes
Merge Sort
Merge sort is the go-to when you need guaranteed O(n log n) or a stable sort. It is also the preferred algorithm for sorting linked lists (because it does not need random access).
Merge sort on a linked list:
1. Find the middle using slow/fast pointers
2. Split the list into two halves
3. Recursively sort each half
4. Merge the two sorted halves
def merge_sort(head):
if head is None or head.next is None:
return head
mid = find_middle(head)
right_half = mid.next
mid.next = None # split the list
left = merge_sort(head)
right = merge_sort(right_half)
return merge(left, right)
Time: O(n log n). Space: O(log n) for recursion stack.
No extra array needed (unlike array merge sort).
Quick Select
Quick select finds the kth smallest (or largest) element in O(n) average time. It uses the partitioning step from quicksort but only recurses into one half.
Problem: Find the kth largest element in an unsorted array.
def quick_select(arr, left, right, k):
pivot_index = partition(arr, left, right)
if pivot_index == k:
return arr[pivot_index]
elif pivot_index < k:
return quick_select(arr, pivot_index + 1, right, k)
else:
return quick_select(arr, left, pivot_index - 1, k)
Partition places the pivot in its correct sorted position.
Elements smaller than pivot go left; larger go right.
Average: O(n). Worst: O(n^2) with bad pivot choices.
To guarantee O(n), use median-of-medians pivot selection
(but this is rarely asked in interviews).
Quick select is preferable to sorting when you only need the kth element, since it avoids the full O(n log n) cost.
When O(n log n) Is Not Fast Enough
Sometimes the problem constraints or follow-up questions demand better than O(n log n):
Technique When to use Complexity
────────────────────────────────────────────────────────────
Counting sort Small range of integer values O(n + k)
Radix sort Fixed-length integers/strings O(d * n)
Bucket sort Uniformly distributed values O(n) avg
Quick select Only need kth element O(n) avg
Two pointers Already sorted, need pairs O(n)
Hash map Need frequency counting O(n)
If the interviewer asks "can you do better than O(n log n)?", the answer usually involves one of these approaches — or rethinking whether you need to sort at all.
Custom Comparators
Many problems require sorting by a custom rule. This is straightforward but easy to get wrong.
Problem: Sort a list of strings by their numeric value.
Input: ["10", "2", "30"]
Default sort: ["10", "2", "30"] (lexicographic)
Desired sort: ["2", "10", "30"] (numeric)
sorted(strings, key=lambda s: int(s))
Problem: Sort intervals by start time, breaking ties by end time.
sorted(intervals, key=lambda x: (x[0], x[1]))
Problem: Arrange numbers to form the largest possible number.
Input: [3, 30, 34, 5, 9]
Output: "9534330"
The comparator: given two numbers a and b, compare "ab" vs "ba".
Is "330" > "303"? Yes, so 3 comes before 30.
from functools import cmp_to_key
def compare(a, b):
if str(a) + str(b) > str(b) + str(a):
return -1
return 1
sorted(nums, key=cmp_to_key(compare))
Practical Interview Applications
Finding Duplicates
Approach 1: Sort and scan adjacent elements. O(n log n) time, O(1) space.
arr.sort()
for i in range(1, len(arr)):
if arr[i] == arr[i-1]:
return arr[i]
Approach 2: Hash set. O(n) time, O(n) space.
seen = set()
for num in arr:
if num in seen:
return num
seen.add(num)
Merge Intervals
Problem: Merge overlapping intervals.
Input: [[1,3], [2,6], [8,10], [15,18]]
Output: [[1,6], [8,10], [15,18]]
Sort by start time, then merge greedily:
intervals.sort(key=lambda x: x[0])
merged = [intervals[0]]
for start, end in intervals[1:]:
if start <= merged[-1][1]:
merged[-1][1] = max(merged[-1][1], end)
else:
merged.append([start, end])
Two Sum Variants with Sorting
Problem: Find all unique pairs that sum to target.
Sort the array, then use two pointers:
arr.sort()
left = 0
right = len(arr) - 1
results = []
while left < right:
current_sum = arr[left] + arr[right]
if current_sum == target:
results.append([arr[left], arr[right]])
left += 1
right -= 1
while left < right and arr[left] == arr[left-1]:
left += 1 # skip duplicates
elif current_sum < target:
left += 1
else:
right -= 1
Binary Search Decision Framework
When facing a problem in an interview, ask yourself:
Is the input sorted or can it be sorted?
-> Classic binary search or sort + binary search
Is there a monotonic relationship between input and output?
-> Binary search on the answer space
Do you need the kth element?
-> Quick select (O(n)) or sort (O(n log n))
Does the problem ask for "minimum maximum" or "maximum minimum"?
-> Binary search on the answer space
Is the search space too large for linear scan?
-> Binary search
Common Pitfalls
- Off-by-one errors in binary search: The boundary conditions (left vs right, inclusive vs exclusive, mid +/- 1) are the most common source of bugs. Always trace through a 3-element example to verify.
- Integer overflow in mid calculation:
(left + right) / 2can overflow in languages with fixed-size integers. Useleft + (right - left) / 2instead. - Forgetting to handle duplicates: When searching for the first or last occurrence, a naive binary search returns any occurrence. You must continue searching after finding a match.
- Using sort when you do not need to: If you only need the kth element, quick select is O(n) average. If you only need the top k, a heap of size k is O(n log k). Sorting the entire array is often overkill.
- Not recognizing binary search on answer space: Many problems that do not look like search problems (minimum capacity, maximum distance, splitting arrays) are solvable with binary search on the answer. If the answer has a monotonic feasibility property, binary search applies.
- Infinite loops in binary search: If your update rules do not guarantee the search space shrinks every iteration, you will loop forever. Ensure that left or right moves toward the other in every branch.
- Unstable sort when order matters: If equal elements must maintain their original order, use a stable sort (merge sort) or add the original index as a tiebreaker.
Key Takeaways
- Binary search is not just for finding elements in sorted arrays. Its most powerful application is searching on the answer space — any time a monotonic feasibility function exists, binary search applies.
- Get the binary search boundary conditions right. Pick one convention (closed interval with
left <= rightor half-open withleft < right) and practice it until it is automatic. - Quick select finds the kth element in O(n) average time. Use it when you need a single order statistic, not the full sorted result.
- Merge sort is the right choice for linked lists and when you need stability. Quick sort is the right choice for in-place array sorting when stability does not matter.
- Custom comparators are a frequent interview requirement. Know how to sort by arbitrary criteria in your language of choice.
- When O(n log n) is not fast enough, consider counting sort, radix sort, or whether sorting is even necessary — sometimes a hash map or two pointers on sorted data gets you to O(n).