Swift's Sequence Inside The Compiler: How for
loops work internally
One of my favorite things about Swift is that almost every compiler feature is built on top of actual classes and protocols of the language. This means that if you see a native type that has some special magical property (like SwiftUI's declarative syntax, which was covered in another post here in SwiftRocks), it's likely that you can reproduce that behavior in your custom types.
In this article, I'll cover one of my favorite syntax sugars in the language: the protocols and the internal compiler behavior that allow you to write for
loops. You might have heard already of the Sequence
family of protocols, and here, we'll see how the compiler uses them as the building blocks for for
loops.
for i in 0..<10 {
print("Wait, this is not a hardcoded language thing?")
}
In most languages, the ability to iterate simple data types like integers in for
statements is often hardcoded into the compiler. This was no different in the first few versions of Swift where the standard was to use old C-style loops:
// Swift 1.2
for var i = 0; i < 10; i++ {
print(i)
}
for in
style loops came along with the Range
type, and the release of Swift 3 officially removed the old C-style loops from the language and forced everyone to use the for in
loops that we are all used to. Now, for
loops in Swift are based on actual protocols that you can inherit to add for in
capabilities to anything you create that can be represented as such. As expected, this means that the ability to use ranges as the parameter of a loop isn't because ranges are treated differently in the compiler -- but because ranges conform to specific public protocols.
IteratorProtocol
The base of how iteration in general works in Swift is the IteratorProtocol
type -- a very simple protocol that only takes two components: An associatedtype
that represents what's being iterated, and a next
property that returns the "next" element on the iteration (or nil if it's over).
A basic example of something that can be implemented as an IteratorProtocol
is a countdown timer that keeps reducing a value until it reaches zero:
struct CountdownIterator: IteratorProtocol {
typealias Element = Int
var count: Int
mutating func next() -> Int? {
if count == 0 {
return nil
} else {
self.count -= 1
return count
}
}
init(count: Int) {
self.count = count
}
}
If we access this iterator continuously we'll get diminishing values until the iteration ends, represented by nil:
var iterator = CountdownIterator(count: 3)
iterator.next() // 2
iterator.next() // 1
iterator.next() // 0
iterator.next() // nil
iterator.next() // nil
Iterators are said to be destructive because you can't access a specific value more than once -- you need to go from the beginning to the end in a linear fashion and you can only do it. If you need to access a value again, you need to create another iterator and repeat the process.
Sequence
In terms of iteration, the Sequence
protocol is what we interact with the most. In short, a Sequence
is simply a more high level abstraction of a IteratorProtocol
type that provides additional functionality on top of the regular iteration process. While Swift allows you to create Sequences
that are also IteratorProtocols
, the existence of this protocol allows you to separate the definition of the type itself from the logic that handles its iteration. To use it, all you have to do is create a makeIterator()
type that returns an IteratorProtocol
:
struct Countdown: Sequence {
typealias Iterator = CountdownIterator
let count: Int
init(count: Int) {
self.count = count
}
func makeIterator() -> CountdownIterator {
return CountdownIterator(count: count)
}
}
The implementation requirements of the Sequence
protocol might make it look useless, but don't judge a book by its cover -- it actually contains tons of required methods, but they are all implemented for you in the standard library. Sequence
contains pre-definitions for lots of the commonly used sequence-based algorithms, including map
, filter
and reduce
!
Even though IteratorProtocol
is the type that defines how the iteration happens, Sequence
is the type that provides the additional properties and methods that make these iterations useful in the first place:
let countdown = Countdown(count: 3)
countdown.map { $0 * $0 }
countdown.allSatisfy { $0.isMultiple(of: 2) }
countdown.max()
countdown.sorted()
countdown.forEach { print($0) }
Just like in IteratorProtocol
, we can create an iterator and linearly retrieve values from it. This is how all these properties and methods work -- for example, this is how we could implement a clone of map()
:
extension Sequence {
func map<T>(_ transform: (Iterator.Element) throws -> T) rethrows -> [T] {
var iterator = makeIterator()
var result = [T]()
while let next = iterator.next() {
result.append(try transform(next))
}
return result
}
}
But this isn't all -- the greatest property of Sequences
and the reason this article was conceived is that Sequences
are deeply tied to the Swift compiler in the form of for loops. Like mentioned before, the reason you can use ranges in loops is not because ranges are special, but simply because ranges are Sequences
. Because our example Countdown
type is a Sequence
, it can also be used inside a for
loop!
for duration in Countdown(count: 3) {
print(duration)
}
As one would expect, this will work with any type that conforms to Sequence
.
How for loops work internally in Swift
Interestingly, this behavior isn't new to iOS -- old-timers may remember that you could unlock for-each style loops in Objective-C in a similar fashion by making objects inherit from NSFastEnumerator
:
for (id object in objectThatConformsToFastEnumerator) {
NSLog(@"%@", object);
}
But unlike back then, with Swift we can actually see what's happening under the hood. for
exists inside the compiler for the purpose of getting your code to be parsed correctly, but the final code is actually different. To inspect this, what I like to do is to compile an example file and have Swift dump its final Swift Intermediate Language representation:
swiftc -emit-sil forLoopTest.swift
If I run this command with an example forLoopTest.swift
file that contains an for
loop for an array such as the one below, we'll see things like this around the area where the loop was supposed to be defined:
let array = [1,2,3]
for num in array {
print(num)
}
// function_ref Collection<>.makeIterator()
%40 = function_ref ...
// function_ref IndexingIterator.next()
%40 = function_ref ...
switch_enum %43 : $Optional<Int>, case #some!enumelt.1: bb3, case #none!enumelt: bb2
This shows us that for
loops don't really exist -- they are simply a form of syntax sugar to allows you to effortlessly iterate an IteratorProtocol
without having to actually extract the iterator!
We can get more details by seeing Swift's source code, more specifically the classes that handle the semantic analysis of the code. In this case, the magic happens after the compiler type-checks the loop: If we analyze the code from the type-checker, we can see that if the loop represents valid code that iterates a Sequence
, it injects a new property called ${loopName}$generator
that fetches the sequence's iterator, then fetches the iterator's Element
type and then finally moves our original closure into a new loop that loops the iterator's next()
property.
For example, if we compile the previous loop example, the final result will be something comparable (but not exactly) to this:
let array = [1,2,3]
var $num$generator = array.makeIterator()
while let num = $num$generator.next() {
print(num)
}
The Swift compiler uses the $
prefix for internal compiler synthesized properties. Because it needs to inject actual code, it uses a symbol that is impossible to be used in regular code to make sure we won't have namespacing issues. Interestingly, you can actually interact with some of these internal properties as part of the new property wrappers feature.
What about Collections?
When studying more about Swift's collection types, IteratorProtocol
, Sequence
and Collection
are often mentioned together, and you might have noticed that the generated SIL for our example mentions Collections
as it uses an array. If the syntax sugar is linked to Sequences
only, how do collections work?
We can say that a Collection
is simply a more jacked-up version of a Sequence
. In fact, all Collections
are also Sequences
:
public protocol Collection: Sequence { ... }
Unlike regular Sequences
, Collections
allow you to access elements multiple times, iterate them in reverse order (in the case of BidirectionalCollections
) and even access specific elements directly through subscripts (in the case of RandomAccessCollections
like Array<>
)
Unlike Sequences
, Collections
do not need to provide an IteratorProtocol
. By default, Swift provides the IndexingIterator
struct: an IteratorProtocol
that takes a collection and simply traverses the indexes that are defined as part of the Collection
protocol. As all Collections
are Sequences
, they can also benefit from having syntax sugar loops.
---