Heap Data Structure
Heaps are data structures built to handle priority-based operations efficiently. Their core strength is finding and removing the highest (or lowest) priority element in time, which makes them the go-to choice for priority queues, scheduling systems, and sorting.
This section covers how to build heaps from arrays, how to use them for finding k-th elements and sorting, and when heaps are (and aren't) the right tool.
Building Heaps from Arrays
There are two distinct approaches to building a heap, and they have different time complexities. This is a common exam topic, so make sure you understand the difference.
Approach 1: Bottom-up heapify (the faster way)
The heapify procedure converts an existing array into a heap in-place. It works by treating the array as a complete binary tree and fixing heap violations from the bottom up.
-
Start at the last non-leaf node (index in a zero-indexed array).
-
For each node, compare it with its children and swap with the larger child (max-heap) or smaller child (min-heap) if the heap property is violated.
-
After a swap, continue sifting down the swapped node until it lands in a valid position.
-
Move to the next node toward the root and repeat.
The time complexity is , not . This surprises a lot of people. The reason is that most nodes are near the bottom of the tree and only need to sift down a short distance. A formal proof uses the fact that the sum of heights across all nodes is .
Approach 2: Repeated insertion (the slower way)
You can also build a heap by starting with an empty heap and inserting elements one at a time. Each insertion places the new element at the next available position and then sifts it up to restore the heap property.
- Each insertion takes in the worst case.
- Over elements, this gives total.
Bottom-up heapify is . Repeated insertion is . If you're asked to "build a heap" on an exam, the bottom-up method is typically what's expected.
%22-600px-Binary_Heap_with_Array_Implementation.JPG.jpg)
Applications of Heaps
Finding the k-th largest element
A min-heap of size can find the k-th largest element in an unsorted array without fully sorting it.
- Insert the first elements into a min-heap.
- For each remaining element in the array, compare it to the root (the current minimum in the heap).
- If the element is larger than the root, remove the root and insert the new element. This keeps the heap holding the largest elements seen so far.
- After processing all elements, the root of the min-heap is the k-th largest.
Time complexity: , since each heap operation on a size- heap costs , and you process elements. When is much smaller than , this is significantly faster than sorting the whole array.
Finding the k-th smallest element works the same way but uses a max-heap of size . You replace the root whenever you encounter a smaller element. The time complexity is still .
%22-MaxHeapify.jpg)
Heaps in Sorting Algorithms
Heap Sort
Heap Sort uses a max-heap to sort an array in ascending order. It works in two phases:
-
Build a max-heap from the input array using bottom-up heapify ().
-
Extract elements one by one:
- Swap the root (the maximum) with the last element in the unsorted portion.
- Shrink the heap size by one (the swapped element is now in its final sorted position).
- Sift the new root down to restore the heap property.
-
Repeat step 2 until the heap is empty.
- Time complexity: in all cases (best, average, worst).
- Space complexity: auxiliary space since it sorts in-place.
- Heap Sort is not stable: equal elements may change their relative order.
Priority Queues
A priority queue is an abstract data type where each element has an associated priority, and you always retrieve the element with the highest (or lowest) priority first. Heaps are the standard implementation.
- Min-heap backs a minimum priority queue (lowest value = highest priority).
- Max-heap backs a maximum priority queue (highest value = highest priority).
- Insertion:
- Extract-min/max:
- Peek at min/max:
Pros and Cons of Heaps
Advantages:
- insertion and extraction of the min/max element.
- peek at the min/max.
- Heap Sort is in-place with guaranteed worst-case performance.
- Simple array-based implementation with no pointers or extra memory overhead.
Disadvantages:
- Searching for an arbitrary element is since the heap only guarantees a partial ordering (parent vs. children), not a full one.
- Updating an element's priority (decrease-key or increase-key) requires knowing the element's position. Without an auxiliary index, finding it costs .
- Heap Sort is not stable, so equal elements may be reordered.
- In practice, heaps have worse cache performance than algorithms like quicksort due to the non-sequential memory access pattern of sifting through tree levels.
When to choose a heap:
- Priority-based access is the primary operation (scheduling, event-driven simulations, Dijkstra's algorithm).
- You need guaranteed worst-case sorting and can't afford quicksort's worst case.
When something else fits better:
- If you need to search for specific elements frequently, a balanced BST gives you search.
- If stable sorting matters (preserving the order of equal elements), use Merge Sort instead of Heap Sort.
- If your data changes frequently and you need ordered traversal, a BST or balanced tree is more appropriate.