Binary Search
Intro
Binary Search finds a target in a sorted array by repeatedly cutting the search range in half. This gives logarithmic time complexity and predictable performance for large inputs. Use it when the data is sorted and random access is cheap. Example: finding a user id in a sorted list of millions of numeric ids.
Deeper Explanation
- Keep two boundaries
leftandrightand inspect the middle index each loop. - If
a[mid]is less than target, movelefttomid + 1; ifa[mid]is greater than target, moverighttomid - 1; if equal, returnmid. - Use
mid = left + (right - left) / 2to avoid integer overflow in fixed-width integer languages. - Complexity:
O(log n)time,O(1)extra space for iterative implementation.
Example
public static int BinarySearch(int[] arr, int target)
{
int left = 0;
int right = arr.Length - 1;
while (left <= right)
{
int mid = left + (right - left) / 2;
if (arr[mid] == target)
{
return mid;
}
if (arr[mid] < target)
{
left = mid + 1;
}
else
{
right = mid - 1;
}
}
return -1;
}
Diagram
flowchart TD
A[Start with sorted array and target] --> B[Set left and right]
B --> C{left less or equal right}
C -->|No| Z[Not found]
C -->|Yes| D[Compute mid safely]
D --> E{value at mid equals target}
E -->|Yes| F[Return mid]
E -->|No| G{value at mid less than target}
G -->|Yes| H[Move left to mid plus one]
G -->|No| I[Move right to mid minus one]
H --> C
I --> CPitfalls
- Unsorted input breaks correctness because binary search relies on monotonic ordering; sort first or use a different strategy.
- Off by one boundary mistakes can produce infinite loops; keep loop condition and boundary updates consistent (
left <= right). - Duplicate values require a variant when you need first or last occurrence, not just any match.
Tradeoffs
- Binary Search vs Linear Search: binary search is much faster on large sorted arrays, but linear search works on unsorted data and tiny lists without sort cost.
- Binary Search vs Hash lookup: hash lookup is average
O(1)but needs extra memory and hash maintenance; binary search works in-place on sorted arrays and preserves order for range queries.
Questions
Why does binary search require sorted data?
- The half-split decision depends on monotonic ordering.
- Without sorting,
a[mid] < targetgives no guarantee about where the target can be. - Binary search can skip over the target on unsorted input and return false negatives.
- Why it matters: correctness depends on this precondition, so production code should assert or document it.
How do you find the first occurrence of a duplicated value?
- On equality, store
midas a candidate answer. - Continue searching the left half by setting
right = mid - 1. - Keep the same loop condition and safe midpoint calculation.
- Return the stored candidate after the loop ends.
- Why it matters: this lower-bound variant is a common interview and production requirement.
Why is
left + (right - left) / 2 preferred over (left + right) / 2?
- In fixed-width integer languages,
left + rightcan overflow for large index values. left + (right - left) / 2computes the same midpoint without overflow risk.- The same safety principle applies across many divide-and-conquer algorithms.
- Why it matters: this prevents rare but high-impact correctness bugs at scale.
Links
- Binary search (cp algorithms)
- Array BinarySearch method .NET
- Nearly all binary searches and mergesorts are broken
Whats next