How OptionSet works inside the Swift Compiler
Swift's OptionSet protocol is the bridged version of Objective-C's NS_OPTIONS enum -- a handy tool that allows you to very efficiently define a combination of "options" that can be used for a specific operation.
If, for example, we were developing a JSON serializer, it would be nice to let the user control certain aspects of the serialization process, such as whether keys are sorted or whether indentation should be applied to the final JSON. In fact, this is how Foundation's JSONSerialization works:
try JSONSerialization.data(withJSONObject: json, options: [.prettyPrinted, .sortedKeys])
From a "list" of options (the quotes will make sense later!), the OptionSet
protocol contains a contains(_:)
method to see if a specific option is inside the "list". Having several options passed together means that all these options should be considered, and passing no options can be done through an empty array.
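For example, with Foundation's real WritingOptions you can build a set of options and query it directly; here's a quick sketch:

import Foundation

let options: JSONSerialization.WritingOptions = [.prettyPrinted, .sortedKeys]
options.contains(.prettyPrinted)    // true
options.contains(.fragmentsAllowed) // false

// "No options" is just the empty literal:
let noOptions: JSONSerialization.WritingOptions = []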
But despite the looks, the options parameter is not an array but a WritingOptions struct that conforms to the OptionSet protocol. How the hell does it behave like a Set even though it is not?!
If you've been following this blog, you'll know that the answer is a bunch of Swift compiler magic. Let's see what this protocol is, why OptionSets are preferred over a regular Set in this case, and finally what magic is done to provide the final behavior.
Context: Objective-C's NS_OPTIONS
Before analyzing the Swift compiler code that allows a regular struct to behave like a Set, let's see where OptionSet came from.
The Objective-C equivalent of OptionSet was the NS_OPTIONS macro, a glorified enum that allowed the same option-mixing gimmick. With it, you can define your options just like an integer enum, like these hypothetical options for browsing SwiftRocks articles:
typedef NS_OPTIONS(NSUInteger, SwiftRocksOptions) {
    SwiftRocksOptionsDarkMode = 1 << 0,
    SwiftRocksOptionsNoJavaScript = 1 << 1,
    SwiftRocksOptionsNoImages = 1 << 2,
};
However, instead of using sequential numbers like 0, 1, and 2 as in a regular enum, the idea is that you use powers of two (most easily expressed as the number 1 left-shifted (<<) by some amount):
If the usage of powers of two sounds confusing, consider what a three-bit number 0 looks like in binary:
0 0 0
In contrast, this is what a three-bit number 7 looks like in binary:
1 1 1
Since we have three bits and each of these bits can be a zero or a one, what if we treated each bit as an option?
The first bit (from the right) could be the first option (DarkMode), the second bit the second option (NoJavaScript), and the third bit the third option (NoImages). If a bit is set to one, it means that option is "selected". This means that we could represent the selection of DarkMode as the number 1 (0 0 1), the selection of NoJavaScript as the number 2 (0 1 0), the selection of NoImages as the number 4 (1 0 0), and a combination of NoJavaScript and NoImages as the number 6 (1 1 0).
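Here's that mapping spelled out in a few lines of Swift, with names mirroring the hypothetical options above:

// Each option claims one bit of the integer:
let darkMode: UInt8     = 1 << 0 // 0 0 1 (1)
let noJavaScript: UInt8 = 1 << 1 // 0 1 0 (2)
let noImages: UInt8     = 1 << 2 // 1 0 0 (4)

// The NoJavaScript + NoImages selection described above:
let selection: UInt8 = 0b110     // 1 1 0 (6)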
The process of creating such numbers is generally referred to as bitmasking and is a very efficient way to track the existence/combination of arbitrary values. While using a regular Array or Set would require us to store the actual elements in memory, the bitmask requires just the memory of the single integer used to store the mask.
Adding and checking options in a mask is done through bitwise operators. The bitwise OR operator (|) returns a number that has a specific bit set to one if any of the numbers in the operation do so, so since we're using powers of two, using it on two options returns a number that represents their combination:
NSUInteger options = SwiftRocksOptionsNoJavaScript | SwiftRocksOptionsNoImages;
// 1 1 0 (6)
On the other hand, the bitwise AND (&) operator returns a number that has a specific bit set to one only if both numbers in the operation do so. Because we're using powers of two, ANDing the mask with the specific option we're checking gives a non-zero value if the option exists in the set (and zero if it doesn't):
(options & SwiftRocksOptionsDarkMode) != 0 // false
(options & SwiftRocksOptionsNoJavaScript) != 0 // true
(options & SwiftRocksOptionsNoImages) != 0 // true
Option bitmasking is a very clever use of the binary representation of a number. While unrelated to this article, I like to mention that bitmasks like these also power the most efficient known exact solution to the famous Traveling Salesman Problem.
How NS_OPTIONS is bridged to OptionSet
In Swift, you don't have to create enums and manually generate these masks -- the OptionSet protocol abstracts the process of creating such bitmasks. Hooray!
But what about things that come from Objective-C? If the Clang importer section of the compiler detects an NS_OPTIONS macro, it converts it to a struct that conforms to OptionSet. OptionSet itself inherits from RawRepresentable, so the enum cases are converted into static let properties that take a rawValue. Here's an example of a converted JSONSerialization.WritingOptions:
public struct WritingOptions: OptionSet {
    public typealias RawValue = Int
    public let rawValue: RawValue

    public init(rawValue: RawValue) {
        self.rawValue = rawValue
    }

    public static let prettyPrinted = WritingOptions(rawValue: 1 << 0)
    public static let sortedKeys = WritingOptions(rawValue: 1 << 1)
    public static let fragmentsAllowed = WritingOptions(rawValue: 1 << 2)
    public static let withoutEscapingSlashes = WritingOptions(rawValue: 1 << 3)
}
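For comparison, here's a sketch of how the hypothetical SwiftRocksOptions from the Objective-C example could be written as a native Swift OptionSet by hand:

struct SwiftRocksOptions: OptionSet {
    let rawValue: UInt

    static let darkMode     = SwiftRocksOptions(rawValue: 1 << 0)
    static let noJavaScript = SwiftRocksOptions(rawValue: 1 << 1)
    static let noImages     = SwiftRocksOptions(rawValue: 1 << 2)
}

let options: SwiftRocksOptions = [.noJavaScript, .noImages]
options.rawValue // 6, the same bitmask we computed by hand earlier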
How OptionSet is defined in the Standard Library
Although OptionSet is normally used with integers and bitmasking, it can, in theory, be used with anything else, as the protocol is just a rawValue plus some set-algebra methods:
public protocol OptionSet: RawRepresentable, SetAlgebra {
    associatedtype Element = Self // from SetAlgebra
    init(rawValue: Self.RawValue)

    func union(_ other: Self) -> Self
    func intersection(_ other: Self) -> Self
    func symmetricDifference(_ other: Self) -> Self
    func contains(_ member: Self) -> Bool
    mutating func insert(_ newMember: Self.Element) -> (inserted: Bool, memberAfterInsert: Self.Element)
    mutating func remove(_ member: Self.Element) -> Self.Element?
    mutating func update(with newMember: Self.Element) -> Self.Element?
}
However, implementing these methods isn't required -- if the rawValue type is a fixed-width integer (i.e. a number that isn't a floating-point), you'll get default implementations in the form of the bitmasking techniques seen before!
extension OptionSet where RawValue: FixedWidthInteger {
    public mutating func formUnion(_ other: Self) {
        self = Self(rawValue: self.rawValue | other.rawValue)
    }

    public mutating func formIntersection(_ other: Self) {
        self = Self(rawValue: self.rawValue & other.rawValue)
    }
}
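You can watch these defaults do their bitwise work by inspecting rawValue; a small sketch using Foundation's WritingOptions:

import Foundation

var options: JSONSerialization.WritingOptions = .prettyPrinted
options.rawValue                      // 1 (0 1)

options.formUnion(.sortedKeys)        // 1 | 2
options.rawValue                      // 3 (1 1)

options.formIntersection(.sortedKeys) // 3 & 2
options.rawValue                      // 2 (1 0)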
OptionSet Syntax Sugars
We've seen how to work with OptionSet through its methods, but where does this behavior come from?
options: [.prettyPrinted, .sortedKeys]
Even though the options argument is the struct itself, we're sending an array of them to the type -- and it works.
Interestingly enough, this syntax sugar has nothing to do with OptionSet itself. It comes from a protocol that it inherits from: SetAlgebra.
The SetAlgebra protocol is the backbone for mathematical set operations in the Standard Library and is used by types like Set and OptionSet:
public protocol SetAlgebra: Equatable, ExpressibleByArrayLiteral {
    associatedtype Element

    func union(_ other: Self) -> Self
    func intersection(_ other: Self) -> Self
    func symmetricDifference(_ other: Self) -> Self
    func contains(_ member: Self.Element) -> Bool
    ...
}
(This looks very similar to OptionSet itself, and that's because OptionSet can be seen as just a light abstraction of SetAlgebra that provides the bitmasking features through a rawValue.)
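A nice consequence is that generic code written against SetAlgebra works with both the hash-based Set and the bitmask-based OptionSet. Here's a small sketch with a hypothetical hasMember helper:

import Foundation

// A generic helper constrained only to SetAlgebra:
func hasMember<S: SetAlgebra>(_ set: S, _ member: S.Element) -> Bool {
    set.contains(member)
}

hasMember(Set([1, 2, 3]), 2)                                           // true
hasMember(JSONSerialization.WritingOptions.prettyPrinted, .sortedKeys) // false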
The magic comes from this little extension:
extension SetAlgebra {
    public init(arrayLiteral: Element...) {
        self.init(arrayLiteral)
    }

    public init<S: Sequence>(_ sequence: S) where S.Element == Element {
        self.init()
        for e in sequence { insert(e) }
    }
}
SetAlgebra inherits from ExpressibleByArrayLiteral, which allows conforming types to be created directly from an array literal. This triggers an insert() call for every element, which for our OptionSet types means that we'll progressively trigger bitwise OR (|) operations, resulting in a final OptionSet that represents the combination of all the options. Hooray!
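In other words, the array literal from the start of the article roughly desugars into something like this:

// [.prettyPrinted, .sortedKeys] effectively becomes:
var options = JSONSerialization.WritingOptions()
options.insert(.prettyPrinted) // rawValue: 0 | 1 == 1
options.insert(.sortedKeys)    // rawValue: 1 | 2 == 3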
How ExpressibleByArrayLiteral works internally
The innards of the ExpressibleBy family are covered in detail in my previous article, but as a short explanation: when the type checker finds a literal being assigned to an explicit type, it checks if that type conforms to the related ExpressibleBy protocol, and converts the expression to the designated initializer from the protocol if it does (throwing a compilation error if it doesn't). For more details, check the linked article.
Conclusion
The OptionSet protocol itself is hardly used nowadays, but its existence has some history and shows the importance of analyzing what you're developing in terms of performance. You can always use a regular Array or Set to define these options, but should you? For cases where you have a limited set of unique options, using a bitmask is considerably quicker and less resource-intensive, something your user's device will be very grateful for :)