Searching
Searching algorithms find elements in collections. The right algorithm depends on whether the data is sorted, the access pattern, and the data structure.
Linear Search
Scan every element sequentially.
FUNCTION LINEAR_SEARCH(arr, target)
FOR i ← 0 TO length(arr) - 1
IF arr[i] = target
RETURN i
RETURN NOT_FOUND
Time: O(n). Space: O(1). Works on unsorted data.
When to use: Small arrays (n < ~32), unsorted data, or data that can't be sorted/indexed.
Sentinel optimization: Place the target at the end of the array. Eliminates the bounds check in the loop. Minor speedup.
Binary Search
For sorted arrays. Repeatedly halve the search space.
FUNCTION BINARY_SEARCH(arr, target)
lo ← 0, hi ← length(arr)
WHILE lo < hi
mid ← lo + (hi - lo) / 2 // avoid overflow vs (lo + hi) / 2
IF arr[mid] = target
RETURN mid
ELSE IF arr[mid] < target
lo ← mid + 1
ELSE
hi ← mid
RETURN NOT_FOUND
Time: O(log n). Space: O(1).
Variants
Lower Bound
Find the first position where arr[i] ≥ target (insertion point for target).
FUNCTION LOWER_BOUND(arr, target)
lo ← 0, hi ← length(arr)
WHILE lo < hi
mid ← lo + (hi - lo) / 2
IF arr[mid] < target
lo ← mid + 1
ELSE
hi ← mid
RETURN lo
Upper Bound
Find the first position where arr[i] > target.
FUNCTION UPPER_BOUND(arr, target)
lo ← 0, hi ← length(arr)
WHILE lo < hi
mid ← lo + (hi - lo) / 2
IF arr[mid] ≤ target
lo ← mid + 1
ELSE
hi ← mid
RETURN lo
First/Last Occurrence
- First occurrence: Use lower_bound, then check if arr[result] == target.
- Last occurrence: Use upper_bound - 1, then check.
- Count occurrences: upper_bound - lower_bound.
Binary Search on Answer
Apply binary search to the answer space rather than an array. Works when the predicate is monotonic (false for small values, true for large values — or vice versa).
Example: Find the smallest x such that f(x) ≥ target, where f is monotonically increasing.
FUNCTION BINARY_SEARCH_ON_ANSWER(lo, hi, PREDICATE)
WHILE lo < hi
mid ← lo + (hi - lo) / 2
IF PREDICATE(mid) = true
hi ← mid
ELSE
lo ← mid + 1
RETURN lo
Applications: Minimize maximum (minimax problems), capacity/threshold problems, square root computation, koko eating bananas, etc.
Common Pitfalls
- Integer overflow: Use
lo + (hi - lo) / 2instead of(lo + hi) / 2. - Off-by-one errors: Use half-open intervals [lo, hi) consistently. Carefully choose
lo = mid + 1vslo = mid. - Infinite loops: Ensure search space shrinks each iteration. When lo + 1 == hi with inclusive bounds, mid == lo → if lo = mid, no progress.
Interpolation Search
Estimate the position based on the value (like looking up a name in a phone book — start near the right letter).
pos = lo + ((target - arr[lo]) × (hi - lo)) / (arr[hi] - arr[lo])
Time: O(log log n) for uniformly distributed data. O(n) worst case (non-uniform).
When to use: Uniformly distributed sorted data. Otherwise, binary search is safer.
Exponential Search
Find the range containing the target, then binary search within it.
FUNCTION EXPONENTIAL_SEARCH(arr, target)
n ← length(arr)
IF n = 0 THEN RETURN NOT_FOUND
IF arr[0] = target THEN RETURN 0
bound ← 1
WHILE bound < n AND arr[bound] < target
bound ← bound * 2
// Binary search in [bound/2, min(bound, n-1)]
lo ← bound / 2
hi ← MIN(bound, n - 1)
RETURN BINARY_SEARCH(arr[lo..hi], target)
Time: O(log i) where i is the position of the target. Better than binary search when the target is near the beginning.
Use case: Unbounded or semi-infinite sorted lists. When the target is likely near the start.
Ternary Search
For unimodal functions (one peak or valley). Finds the maximum (or minimum) of f(x) on [lo, hi].
FUNCTION TERNARY_SEARCH_MAX(lo, hi, f)
REPEAT 200 TIMES // sufficient iterations for precision
m1 ← lo + (hi - lo) / 3
m2 ← hi - (hi - lo) / 3
IF f(m1) < f(m2)
lo ← m1
ELSE
hi ← m2
RETURN (lo + hi) / 2
Time: O(log n) (reduces range by 1/3 each iteration). Converges slower than binary search on monotonic functions, but works on unimodal functions.
Alternative: For differentiable functions, binary search on the derivative (find where f'(x) = 0) is usually better.
Search in Rotated Sorted Array
A sorted array rotated at an unknown pivot: [4, 5, 6, 7, 0, 1, 2].
Modified binary search: At each step, determine which half is sorted and whether the target is in that half.
FUNCTION SEARCH_ROTATED(arr, target)
lo ← 0, hi ← length(arr) - 1
WHILE lo ≤ hi
mid ← lo + (hi - lo) / 2
IF arr[mid] = target THEN RETURN mid
IF arr[lo] ≤ arr[mid] // left half is sorted
IF arr[lo] ≤ target AND target < arr[mid]
hi ← mid - 1
ELSE
lo ← mid + 1
ELSE // right half is sorted
IF arr[mid] < target AND target ≤ arr[hi]
lo ← mid + 1
ELSE
hi ← mid - 1
RETURN NOT_FOUND
Time: O(log n).
Two Pointers Technique
Use two pointers (indices) to scan the array, typically from both ends or both from the start.
Two Sum (Sorted Array)
Find two elements summing to target:
FUNCTION TWO_SUM_SORTED(arr, target)
lo ← 0, hi ← length(arr) - 1
WHILE lo < hi
sum ← arr[lo] + arr[hi]
IF sum = target THEN RETURN (lo, hi)
IF sum < target
lo ← lo + 1
ELSE
hi ← hi - 1
RETURN NOT_FOUND
Time: O(n). Much better than brute-force O(n²).
Remove Duplicates
Maintain a "write" pointer and a "read" pointer. Write pointer advances only for unique elements.
Container With Most Water
Two pointers from both ends. Move the shorter side inward (since moving the taller side can't increase area).
Sliding Window
Maintain a window [left, right] over the array. Expand right to include more, shrink left to satisfy constraints.
Maximum Sum Subarray of Size k
FUNCTION MAX_SUM_K(arr, k)
window_sum ← SUM(arr[0..k])
max_sum ← window_sum
FOR i ← k TO length(arr) - 1
window_sum ← window_sum + arr[i] - arr[i - k]
max_sum ← MAX(max_sum, window_sum)
RETURN max_sum
Variable-Length Sliding Window
Expand right until condition violated, then shrink left until condition restored.
Example: Longest substring without repeating characters. Minimum window substring.
Time: O(n) — each element is added and removed at most once.
Applications in CS
- Database indexing: B-tree search is a disk-optimized binary search. Range scans use lower/upper bound.
- System calls:
bsearchin C standard library. Binary search in sorted arrays for configuration lookup. - Debugging:
git bisectis binary search on commit history to find the commit that introduced a bug. - Networking: Binary search for MTU path discovery. IP routing (longest prefix match).
- Algorithms: Binary search on answer is a fundamental technique for optimization problems (minimize/maximize subject to constraint).
- Machine learning: Hyperparameter tuning (though Bayesian optimization is usually better). Feature threshold selection in decision trees.
- Compression: Huffman/arithmetic coding uses search in cumulative frequency tables.