Timsort and Introsort: Swift's Sorting Algorithms

Published on 04 Sep 2019

Have you ever asked yourself which algorithm is used by Swift's sorting method? There are many sorting algorithms out there, and chances are that you'll rarely have to use something other than the language's builtin sort() method. However, knowing the properties of the sorting algorithm built into your language is important if you want to prevent unwanted behaviors and nasty edge cases.

When analyzing sorting algorithms, you'll want to search for two properties:

1 - Sorting Stability

The stability of a sorting algorithm represents the ability of the algorithm to maintain the original order of equal elements after sorting. An unstable sorting algorithm has no guarantees that the order of equal elements in the unsorted array will stay the same after sorting, while a stable one guarantees that they will stay the same.

This might sound weird, after all, if the elements are the same, why should I care about their overall order? This can be true if you're sorting elements by value, but when sorting elements by some arbitrary priority, using unstable algorithms can give you undesired results.

Let's assume that we're building a music player, and our current task is to sort songs based on their popularity:

struct Music: Comparable, Equatable, CustomStringConvertible {
    let name: String
    let popularityValue: Int

    static func < (lhs: Music, rhs: Music) -> Bool {
        return lhs.popularityValue < rhs.popularityValue
    }

    var description: String {
        return name
    }
}

var songs: [Music] = [
    Music(name: "I'm that swifty", popularityValue: 3),
    Music(name: "Swift boi", popularityValue: 5),
    Music(name: "Swift That Walk", popularityValue: 1),
    Music(name: "Too Swift", popularityValue: 5),
]

If we sort songs using Quicksort, we'll get the following result:

extension Array where Element: Equatable & Comparable {
    func quicksort(comparison: ((Element, Element) -> Bool)) -> [Element] {
        var copy = self
        copy.quick(0, count - 1, comparison: comparison)
        return copy
    }

    mutating private func quick(_ i: Int, _ j: Int, comparison: ((Element, Element) -> Bool)) {
        guard i < j else {
            return
        }
        let pivot = partition(i, j, comparison: comparison)
        quick(i, pivot - 1, comparison: comparison)
        quick(pivot + 1, j, comparison: comparison)
    }

    mutating private func partition(_ i: Int, _ j: Int, comparison: ((Element, Element) -> Bool)) -> Int {
        let pivotElement = self[j]
        var indexToAdd = i - 1
        for k in i..<j {
            if comparison(self[k], pivotElement) {
                indexToAdd += 1
                swapAt(indexToAdd, k)
            }
        }
        swapAt(indexToAdd + 1, j)
        return indexToAdd + 1
    }
}

songs = songs.quicksort {
    $0.popularityValue > $1.popularityValue
}
print(songs)

// [Too Swift, Swift boi, I'm that swifty, Swift That Walk]

Although "Swift boi" was placed before "Too Swift" in the original array, Quicksort changed their positions!

That's not too bad though as we never actually used the unsorted version of the array. However, consider what happens if we re-sort the array multiple times:

songs = songs.quicksort {
    $0.popularityValue > $1.popularityValue
}
print(songs)
songs = songs.quicksort {
    $0.popularityValue > $1.popularityValue
}
print(songs)
songs = songs.quicksort {
    $0.popularityValue > $1.popularityValue
}
print(songs)
songs = songs.quicksort {
    $0.popularityValue > $1.popularityValue
}
print(songs)
songs = songs.quicksort {
    $0.popularityValue > $1.popularityValue
}
print(songs)

// [Too Swift, Swift boi, I'm that swifty, Swift That Walk]
// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]
// [Too Swift, Swift boi, I'm that swifty, Swift That Walk]
// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]
// [Too Swift, Swift boi, I'm that swifty, Swift That Walk]

Their relative order keeps changing!

The reason is because Quicksort is an unstable sorting algorithm. If for some reason we needed to continuously update this list in the UI, the user would see songs changing positions in the ranking even though they have the same priority. That's not very good.

To keep their order, we need to use a stable algorithm like Mergesort.

extension Array where Element: Equatable & Comparable {
    func mergesort(comparison: ((Element, Element) -> Bool)) -> [Element] {
        return merge(0, count - 1, comparison: comparison)
    }

    private func merge(_ i: Int, _ j: Int, comparison: ((Element, Element) -> Bool)) -> [Element] {
        guard i <= j else {
            return []
        }
        guard i != j else {
            return [self[i]]
        }
        let half = i + (j - i) / 2
        let left = merge(i, half, comparison: comparison)
        let right = merge(half + 1, j, comparison: comparison)
        var i1 = 0
        var i2 = 0
        var new = [Element]()
        new.reserveCapacity(left.count + right.count)
        while i1 < left.count && i2 < right.count {
            if comparison(right[i2], left[i1]) {
                new.append(right[i2])
                i2 += 1
            } else {
                new.append(left[i1])
                i1 += 1
            }
        }
        while i1 < left.count {
            new.append(left[i1])
            i1 += 1
        }
        while i2 < right.count {
            new.append(right[i2])
            i2 += 1
        }
        return new
    }
}

// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]
// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]
// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]
// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]
// [Swift boi, Too Swift, I'm that swifty, Swift That Walk]

2 - Time/Space Complexity

The second important thing to be aware of is how much additional memory the algorithm takes to run and what are best/worst cases for the algorithm.

My favorite example of this is Counting Sort: where an array is sorted by simply counting the occurrences of each element number of elements and then laying them out in order. If the difference between each value is small, say [3,1,4,2,5], this algorithm can sort arrays in runtimes very close to O(n) -- but if the difference is big, like [1,1000000000], Counting Sort will take an enormous amount of time to run even if the array is small.

Likewise, the famous Quick Sort is highly regarded for being a fast and in-place O(n log n) algorithm in average, but it has a terrible worst case of O(n2) if the pivot is always the highest/smallest element in the partition. If you're dealing with large amounts of data or a constrained environment, there will be a specific sorting algorithm that best fits your needs.

Pre-Swift 5 Algorithm: Introsort

Before Swift 5, Swift's sorting algorithm was a hybrid algorithm called Introsort, which mixes the strengths of Quicksort, Heapsort and Insertion Sort into a single algorithm to guarantee a worse case of O(n log n).

The idea behind Introsort is straightforward: First, if you have less than 20 elements in the partition being sorted, Insertion Sort is used. Although this algorithm has a worst case of O(n2), it also has a best case of O(n). Compared to the general O(n log n) algorithms, Insertion Sort will always perform better in small inputs.

If the array is not small, Quicksort will be used. This will bring our best case to O(n log n), but also maintain the worst case of O(n2). However, Introsort can avoid it -- if the recursion tree for Quicksort gets too deep, the partition switches to Heapsort. In this case, "too deep" is considered as 2 * floor(log2(array.count)).

internal mutating func _introSortImpl(within range: Range<Index>,
                                      by areInIncreasingOrder: (Element, Element) throws -> Bool,
                                      depthLimit: Int) rethrows {
    // Insertion sort is better at handling smaller regions.
    if distance(from: range.lowerBound, to: range.upperBound) < 20 {
        try _insertionSort(within: range, by: areInIncreasingOrder)
    } else if depthLimit == 0 {
        try _heapSort(within: range, by: areInIncreasingOrder)
    } else {
        // Partition and sort.
        // We don't check the depthLimit variable for underflow because this
        // variable is always greater than zero (see check above).
        let partIdx = try _partition(within: range, by: areInIncreasingOrder)
        try _introSortImpl(
            within: range.lowerBound..<partIdx,
            by: areInIncreasingOrder,
            depthLimit: depthLimit &- 1)
        try _introSortImpl(
            within: partIdx..<range.upperBound,
            by: areInIncreasingOrder,
            depthLimit: depthLimit &- 1)
    }
}

Introsort greedily attempts to pick the best algorithm for the given situation, backing up to a different option when the previous choice goes wrong. It has a best case of O(n) and worst case of O(n log n), making it a decent one-size-fits-all algorithm.

In terms of memory usage, it will perform slightly worse than the usual sorting algorithm. Although the three algorithms can sort inplace, Introsort's implementation in Swift was recursive. Since Swift doesn't guarantee that tail recursion calls will be optimized, running the builtin sort() pre-Swift 5 was not the best option if keeping memory usage low was important.

The biggest thing to note is that Introsort was unstable. Although Insertion Sort is stable, the default implementation of Quicksort and Heapsort are not. If the order of equal elements was important, using sort() pre-Swift 5 was also not a good idea.

After Swift 5 - Timsort

Multiple threads surfaced between 2015 and 2018 about adding a stable sorting algorithm to Swift that didn't rely on recursion, but promising discussions first showed up only in the beginning of 2018. In October 2018, a pull request was finally merged to change Introsort to a modified version of Timsort.

Timsort is a hybrid algorithm like Introsort, but with different approaches. It works by dividing an array into smaller sections, sorting these smaller sections with Insertion Sort, and then merging these sections together with Merge Sort. Because both Insertion Sort and Mergesort are stable, Timsort is stable, and while it also has a worst case of O(n log n) and non-constant space complexity, it tends to be considerably quicker than the more naive algorithms in real world scenarios. The reason why Timsort can be so fast is that although the description sounds simple enough, in reality each of these steps are highly tuned for efficiency.

Finding the next "run" (Partitioning)

Instead of dividing everything first and merging as the last step like Mergesort, Timsort scans the array once and progressively merge these regions (called "runs") as they are found.

The beauty is that unlike Mergesort, a run isn't simply the array divided by half. Timsort abuses the fact that every array that you want to sort is likely to have a few contiguous subsequences that are almost or already sorted, which happens to be Insertion Sort's best case. To find the next run, Timsort will advance a pointer until the current sequence stops being an ascending/descending pattern:

[1, 2, 3, 1, 9, 6]
 i     j

Note: This example is just for visual purposes -- because the array here is small, Timsort would just Insertion Sort it right away.

The range from i to j defines our first run run, but the optimizations don't stop here.

First: If the sequence is descending, we can already sort it in linear time by reversing the elements.

Second: To increase the speed of Insertion Sort and to balance the amount of merge operations that will be done later, Timsort defines that every run should have a minimum size of a power of two between 16 and 128, or at least something very close to it. If the run that we found has a size smaller than the minimum run size, then the current run's range is expanded and sorted with Insertion Sort.

// Find the next consecutive run, reversing it if necessary.
var (end, descending) = try _findNextRun(in: self, from: start, by: areInIncreasingOrder)
if descending {
    _reverse(within: start..<end)
}

// If the current run is shorter than the minimum length, use the
// insertion sort to extend it.
if end < endIndex && end - start < minimumRunLength {
    let newEnd = Swift.min(endIndex, start + minimumRunLength)
    try _insertionSort(within: start..<newEnd, sortedEnd: end, by: areInIncreasingOrder)
    end = newEnd
}

For the actual size, Swift specifically will pick a value between 32 and 64 that varies according to the size of the array. Finally, after a run is found, it's added to a stack containing all the other previous runs that we've found.

Merging runs

Every time a run is found, Timsort will attempt to collapse the top three runs of the stack into a single one by merging them together until the following conditions are satisfied:

runs.count < 3 || runs[runs.count - 2].size > (runs[runs.count - 1].size + runs.last!.count)
&&
runs.count < 2 || runs[runs.count - 1].size > runs.last!.size

This is done to reduce the amount of runs we have to remember, but mainly to balance the overall size of the runs as having them close together is beneficial for Timsort's merging phase.

At first glance Timsort's merging phase works just like Mergesort: we compare a pair of elements from each array, picking the smallest one and moving it to its proper position in the final array.

However, Timsort beautifully empowers the fact that if one specific array keeps "winning" the comparison, then it's likely to keep doing so. If this happens, instead of continuing to compare elements, we can simply binary search for the correct position of the element from the "losing" array in the "winning" one, moving all elements before it to the final array. This is called galloping and it saves us a lot of time by letting us skip comparing an entire chunk of the winning array.

The galloping process can be aborted if the binary search process "loses" to the regular comparison process. You can see all the optimizations that are done in the Timsort description. Finally, after all runs were found, the remaining runs in the stack are progressively merged together until the entire array is sorted.

The major differences in Swift is that the compiler's implementation of Timsort doesn't use galloping, and it attempts to collapse runs based on the last four runs instead of the last three. Still, it outperforms Introsort in pretty much every scenario.

Conclusion

Timsort is one of the fastest sorting algorithms for real world problems. Knowing what algorithm your language uses can help you optimize your code by making better decisions based on what data you're handling.

Follow me on my Twitter (@rockbruno_), and let me know of any suggestions and corrections you want to share.

References and Good reads

Stable Sort Proposal
Stable Sort PR
Introsort
Timsort
Timsort (Wiki)

Home / See all posts