Introduction
When discussing sorting algorithms, two key concepts often come up: inversions and stability. Let’s break these down to better understand how they influence the way we sort data.
What Are Inversions?
An inversion is a pair of elements in a sequence that are out of order relative to a given ordering relation. Think of it as a measure of how "unsorted" a list is. For example, consider the following list of strings, where the ordering relation is based on string length:
- [“are”, “we”, “sorting”, “hello”, “world”, “learning”]
This list is clearly not sorted by string length. To quantify how "out of order" it is, we can count the inversions.
In this case, the inversions are:
- (“are”, “we”)
- (“sorting”, “hello”)
- (“sorting”, “world”)
Each inversion represents a pair of elements that are in the wrong order. The more inversions a list has, the further it is from being sorted. In fact, sorting can be thought of as the process of reducing the number of inversions in a sequence to zero.
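Counting inversions this way can be sketched with two nested loops. The helper name `countInversions` and the length-based key below are illustrative, not from any particular library:

```java
import java.util.function.ToIntFunction;

public class InversionCount {
    // Counts pairs (i, j) with i < j whose keys are out of order, i.e. the
    // key of the earlier element is strictly greater than that of the later one.
    static int countInversions(String[] arr, ToIntFunction<String> key) {
        int inversions = 0;
        for (int i = 0; i < arr.length; i++) {
            for (int j = i + 1; j < arr.length; j++) {
                if (key.applyAsInt(arr[i]) > key.applyAsInt(arr[j])) {
                    inversions++;
                }
            }
        }
        return inversions;
    }

    public static void main(String[] args) {
        String[] words = {"are", "we", "sorting", "hello", "world", "learning"};
        // The three inversions listed above: (are, we), (sorting, hello), (sorting, world)
        System.out.println(countInversions(words, String::length)); // prints 3
    }
}
```

This brute-force count takes O(n²) time; a merge-sort-based count can bring it down to O(n log n).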
Stability in Sorting Algorithms
Another important concept in sorting is stability. A sorting algorithm is considered stable if it preserves the relative order of equal elements. Let’s revisit our string example with the original sequence:
- [“hello”, “world”, “we”, “are”, “learning”, “sorting”]
When sorted by string length, there are two valid outcomes:
Stable Sort:
- [“we”, “are”, “hello”, “world”, “sorting”, “learning”]
Here, the equal-length strings “hello” and “world” retain their original relative order.
Unstable Sort:
- [“we”, “are”, “world”, “hello”, “sorting”, “learning”]
In this case, the relative order of “hello” and “world” has been reversed. The first outcome is considered stable because it maintains the original order of equal elements, while the second is unstable.
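In Java this distinction is directly observable: Arrays.sort on an object array is documented to be stable, so sorting the original sequence with a length comparator must produce the stable outcome:

```java
import java.util.Arrays;
import java.util.Comparator;

public class StableSortDemo {
    public static void main(String[] args) {
        String[] words = {"hello", "world", "we", "are", "learning", "sorting"};
        // Arrays.sort on object arrays is guaranteed stable, so "hello" stays
        // ahead of "world" even though both have length 5.
        Arrays.sort(words, Comparator.comparingInt(String::length));
        System.out.println(Arrays.toString(words));
        // prints [we, are, hello, world, sorting, learning]
    }
}
```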
Why Do These Concepts Matter?
Inversions help us quantify how far a list is from being sorted, which can be useful for analyzing the efficiency of sorting algorithms. Stability is crucial in scenarios where the original order of equal elements carries meaningful information. For example, when sorting a list of transactions by date, you might want to preserve the order of transactions that occurred on the same day. By understanding inversions and stability, we gain deeper insight into how sorting algorithms work and how to choose the right one for a given task.
Here are the sorting algorithms we will go through:
- ✅ Selection Sort
- ✅ Bubble Sort
- ✅ Insertion Sort
- ✅ Heap Sort
Selection Sort
Selection sort is a simple and highly intuitive algorithm that is relatively easy to implement. However, it is quite slow, requiring O(n²) time regardless of the input: on every pass, the algorithm must scan the remaining portion of the array to find the smallest element, leading to n + (n - 1) + (n - 2) + ... + 1 comparisons, which results in O(n²) complexity.
One drawback of selection sort is that it is not a stable sorting algorithm. For instance, consider the list [C1, C2, A], where the subscripts distinguish two equal elements. The first pass swaps A with C1, producing the sorted array [A, C2, C1], but the relative order of the two C's has been reversed.
Below is an implementation of the selection sort algorithm:
public class Solution {
    public void selectionSort(int[] arr) {
        // Mutates arr so that it is sorted via selecting the minimum element and
        // swapping it with the corresponding index
        for (int i = 0; i < arr.length; i++) {
            int minIndex = i;
            for (int j = i + 1; j < arr.length; j++) {
                if (arr[j] < arr[minIndex]) {
                    minIndex = j;
                }
            }
            // Swap current index with minimum element in rest of list
            int temp = arr[minIndex];
            arr[minIndex] = arr[i];
            arr[i] = temp;
        }
    }
}
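The instability is easy to observe once equal keys carry distinguishing labels. Below is a sketch (separate from the implementation above) that applies the same select-and-swap logic to a keys array while carrying labels along; C1 and C2 share the key 2:

```java
import java.util.Arrays;

public class SelectionInstabilityDemo {
    // Selection-sorts keys, moving the labels in lockstep, and returns the labels.
    static String[] sortLabeled(int[] keys, String[] labels) {
        for (int i = 0; i < keys.length; i++) {
            int minIndex = i;
            for (int j = i + 1; j < keys.length; j++) {
                if (keys[j] < keys[minIndex]) {
                    minIndex = j;
                }
            }
            // Swap keys and labels together, mirroring selectionSort above
            int tempKey = keys[minIndex];
            keys[minIndex] = keys[i];
            keys[i] = tempKey;
            String tempLabel = labels[minIndex];
            labels[minIndex] = labels[i];
            labels[i] = tempLabel;
        }
        return labels;
    }

    public static void main(String[] args) {
        // C1 and C2 are equal (key 2); A (key 1) starts at the end.
        String[] result = sortLabeled(new int[]{2, 2, 1}, new String[]{"C1", "C2", "A"});
        // The long-distance swap in the first pass moved C1 past C2:
        System.out.println(Arrays.toString(result)); // prints [A, C2, C1]
    }
}
```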
Bubble Sort
Bubble sort works by comparing two adjacent elements at a time. If the left element is greater than the right element, they are swapped. The algorithm then moves to the next pair of adjacent elements and repeats the process. The key idea behind bubble sort is that it continues this process until a full pass is completed without any swaps, indicating that the list is fully sorted.
The runtime of bubble sort depends on the number of passes required to sort the array. Given an array of n elements, each pass examines (n - 1) pairs. In the worst case where the smallest element is at the end of the list, it takes (n - 1) passes to move it to the front, plus one additional pass to confirm that no further swaps are needed. Consequently, the worst-case time complexity of bubble sort is O(n²).
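The pass count is easy to verify with a variant of bubble sort (sketched below with an illustrative sortAndCountPasses helper) that counts full passes, including the final confirming one:

```java
public class BubblePassCount {
    // Bubble-sorts arr and returns the number of full passes made,
    // including the final pass that confirms no swaps are needed.
    static int sortAndCountPasses(int[] arr) {
        int passes = 0;
        boolean hasSwapped = true;
        while (hasSwapped) {
            hasSwapped = false;
            passes++;
            for (int i = 0; i < arr.length - 1; i++) {
                if (arr[i] > arr[i + 1]) {
                    int temp = arr[i];
                    arr[i] = arr[i + 1];
                    arr[i + 1] = temp;
                    hasSwapped = true;
                }
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        // Smallest element at the end: 1 moves left one position per pass,
        // so sorting takes (n - 1) swap passes plus one confirming pass.
        System.out.println(sortAndCountPasses(new int[]{2, 3, 4, 1})); // prints 4
    }
}
```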
A key advantage of bubble sort is that it is a stable sorting algorithm, meaning that elements with equal values retain their relative order after sorting.
While bubble sort is easy to implement and stable, it is generally inefficient and impractical for large datasets due to its slow performance. As a result, it is rarely used in real-world applications.
Below is an implementation of the bubble sort algorithm:
public class Solution {
    public void bubbleSort(int[] arr) {
        // Mutates arr so that it is sorted via swapping adjacent elements until
        // the arr is sorted.
        boolean hasSwapped = true;
        while (hasSwapped) {
            hasSwapped = false;
            for (int i = 0; i < arr.length - 1; i++) {
                if (arr[i] > arr[i + 1]) {
                    // Swap adjacent elements
                    int temp = arr[i];
                    arr[i] = arr[i + 1];
                    arr[i + 1] = temp;
                    hasSwapped = true;
                }
            }
        }
    }
}
Insertion Sort
Insertion sort is a simple sorting algorithm that builds the final sorted list one element at a time by repeatedly inserting the current element into its correct position in the already sorted portion of the list.
In terms of efficiency, the worst-case scenario for this approach occurs when the list is reversed, requiring each element to be inserted at the beginning of the list. This results in a total of 1 + 2 + … + (n - 1) or O(n²) swaps. The space complexity of insertion sort is O(1), as all operations are carried out in-place.
Despite its O(n²) time complexity, insertion sort offers some advantages in practice. Firstly, it is a stable sort, meaning that equal elements will not be swapped in a way that changes their relative order. More importantly, there are situations where insertion sort may actually be the most efficient sorting method.
For example, when working with nearly sorted arrays that have relatively few inversions compared to their size, insertion sort can perform very well, since few swaps are needed. Insertion sort is also often the best choice for small arrays; Java's Arrays.sort() falls back to insertion sort on small (sub)arrays before applying more theoretically efficient sorting algorithms.
However, for larger datasets with many inversions, other sorting algorithms tend to outperform insertion sort. Despite this, among the sorting methods we've discussed, insertion sort remains one of the few that is commonly used, depending on the specific context.
Here is the implementation of insertion sort:
public class Solution {
    public void insertionSort(int[] arr) {
        // Mutates elements in arr by inserting out of place elements into appropriate
        // index repeatedly until arr is sorted
        for (int i = 1; i < arr.length; i++) {
            int currentIndex = i;
            while (currentIndex > 0 && arr[currentIndex - 1] > arr[currentIndex]) {
                // Swap elements that are out of order
                int temp = arr[currentIndex];
                arr[currentIndex] = arr[currentIndex - 1];
                arr[currentIndex - 1] = temp;
                currentIndex -= 1;
            }
        }
    }
}
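The link to inversions is direct: each swap of adjacent elements in the inner loop removes exactly one inversion, so insertion sort performs exactly as many swaps as the input has inversions. A sketch with an illustrative swap counter:

```java
public class InsertionSwapCount {
    // Insertion-sorts arr and returns the number of swaps performed.
    // Each adjacent swap fixes exactly one out-of-order pair, so the total
    // equals the number of inversions in the original array.
    static int sortAndCountSwaps(int[] arr) {
        int swaps = 0;
        for (int i = 1; i < arr.length; i++) {
            int currentIndex = i;
            while (currentIndex > 0 && arr[currentIndex - 1] > arr[currentIndex]) {
                int temp = arr[currentIndex];
                arr[currentIndex] = arr[currentIndex - 1];
                arr[currentIndex - 1] = temp;
                currentIndex--;
                swaps++;
            }
        }
        return swaps;
    }

    public static void main(String[] args) {
        // Nearly sorted: only the pair (3, 2) is inverted -> 1 swap.
        System.out.println(sortAndCountSwaps(new int[]{1, 3, 2, 4, 5})); // prints 1
        // Reversed: every pair is inverted -> n(n - 1)/2 = 10 swaps.
        System.out.println(sortAndCountSwaps(new int[]{5, 4, 3, 2, 1})); // prints 10
    }
}
```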
Heap Sort
Heap Sort is a sorting algorithm that uses a special data structure called a heap to efficiently sort an array. The main idea behind Heap Sort is to first organize the input array into a max-heap and then repeatedly extract the largest element, placing it in its correct position in the sorted array.
How Heap Sort Works
- Building a Max-Heap
Think of the array as a binary tree where:
A parent at index i has its left child at 2i + 1 and right child at 2i + 2 (assuming 0-based indexing).
Start from the last non-leaf node and heapify (adjust elements so the largest number is always at the top).
This ensures that the largest number is always at the root.
- Sorting Using the Heap
Swap the largest element (root) with the last element of the heap.
Reduce the heap size (ignoring the last element, as it is now sorted).
Re-heapify the remaining elements so the next largest number moves to the top.
Repeat this process until all elements are sorted.
Why Heap Sort is Efficient
Time Complexity:
Building the heap: O(N)
Extracting elements: O(log N) per extraction, repeated N times → O(N log N) overall.
Total Time Complexity: O(N log N)
Space Complexity:
Heap Sort works in-place, meaning it doesn't require an auxiliary array (the recursive maxHeapify below does use O(log N) stack space, which an iterative version avoids).
Space Complexity = O(1)
Key Advantages of Heap Sort
- More efficient than Selection Sort (which is O(N²)) because it uses a heap to find the maximum element quickly.
- Works well for large datasets since it guarantees O(N log N) performance in all cases.
- Does not require extra memory, unlike Merge Sort.
Here is the implementation of heap sort:
public class Solution {
    public void heapSort(int[] arr) {
        // Step 1: Build a max heap
        // Time Complexity: O(N) - Heap construction is an O(N) operation
        for (int i = arr.length / 2 - 1; i >= 0; i--) {
            maxHeapify(arr, arr.length, i);
        }
        // Step 2: Extract elements from the heap one by one
        // Outer loop runs N times, each maxHeapify call takes O(log N), so overall O(N log N)
        for (int i = arr.length - 1; i > 0; i--) {
            // Swap the root (max element) with the last element
            // O(1) - Constant time operation
            int temp = arr[i];
            arr[i] = arr[0];
            arr[0] = temp;
            // Heapify the root element to maintain max-heap property
            // O(log N) - Heapify operation per iteration
            maxHeapify(arr, i, 0);
        }
    }

    private void maxHeapify(int[] arr, int heapSize, int index) {
        int left = 2 * index + 1;  // O(1) - Calculate left child index
        int right = 2 * index + 2; // O(1) - Calculate right child index
        int largest = index;       // O(1) - Assume current index is largest
        // Check if left child is larger
        // O(1) - Comparison
        if (left < heapSize && arr[left] > arr[largest]) {
            largest = left;
        }
        // Check if right child is larger
        // O(1) - Comparison
        if (right < heapSize && arr[right] > arr[largest]) {
            largest = right;
        }
        // If largest is not the current node, swap and recursively heapify
        // Swap is O(1), recursive call takes O(log N) in worst case
        if (largest != index) {
            int temp = arr[index];
            arr[index] = arr[largest];
            arr[largest] = temp;
            maxHeapify(arr, heapSize, largest); // Recursive call - O(log N)
        }
    }
}