Swift ExpressibleBy protocols: What they are and how they work internally in the compiler
ExpressibleBy represents a family of protocols in the Swift standard library that let you instantiate a type directly from a literal token, such as a string, an integer, or a floating-point number, provided the type can be "expressed" that way. For example, here's the regular way of creating a URL in Swift:
func getURL() -> URL {
    return URL(string: "https://swiftrocks.com")!
}
However, to avoid calling this initializer every time, you can declare that a URL can be represented directly by its URL string using ExpressibleByStringLiteral:
extension URL: ExpressibleByStringLiteral {
    // Force-unwrapping crashes at runtime if the literal is not a valid URL.
    public init(extendedGraphemeClusterLiteral value: String) {
        self = URL(string: value)!
    }

    public init(stringLiteral value: String) {
        self = URL(string: value)!
    }
}
This allows us to refactor getURL() to create a URL using nothing but a string token:
func getURL() -> URL {
    return "https://swiftrocks.com"
}
The standard library contains the following ExpressibleBy protocols:
* ExpressibleByNilLiteral: Expressible by nil.
* ExpressibleByIntegerLiteral: Expressible by a number token like 10.
* ExpressibleByFloatLiteral: Expressible by a floating-point token like 2.5.
* ExpressibleByBooleanLiteral: Expressible by true/false.
* ExpressibleByUnicodeScalarLiteral: Expressible by a single Unicode scalar. Usage examples of this are Character and String.
* ExpressibleByExtendedGraphemeClusterLiteral: Similar to UnicodeScalar, but consists of a chain of scalars (a grapheme cluster) instead of a single one.
* ExpressibleByStringLiteral: Expressible by a string token like "SwiftRocks".
* ExpressibleByArrayLiteral: Expressible by an array token like [1,2,3].
* ExpressibleByDictionaryLiteral: Expressible by a dictionary token like ["name": "SwiftRocks"].
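To make the list above more concrete, here is a small sketch that uses only standard library types that already conform to these protocols. The variable names are made up for illustration; the point is that the same literal token can produce different types depending on the declared type.

```swift
// Each declaration relies on a standard library conformance to one of the
// ExpressibleBy protocols; the variable names are illustrative only.
let missing: Int? = nil                   // ExpressibleByNilLiteral (Optional)
let factor: Double = 10                   // ExpressibleByIntegerLiteral (Double)
let half: Float = 2.5                     // ExpressibleByFloatLiteral (Float)
let enabled: Bool = true                  // ExpressibleByBooleanLiteral
let scalar: Unicode.Scalar = "a"          // ExpressibleByUnicodeScalarLiteral
let accented: Character = "e\u{301}"      // one grapheme cluster made of two scalars
let name: String = "SwiftRocks"           // ExpressibleByStringLiteral
let unique: Set<Int> = [1, 2, 3]          // ExpressibleByArrayLiteral (Set)
let ages: [String: Int] = ["bruno": 30]   // ExpressibleByDictionaryLiteral

print(factor, accented.unicodeScalars.count, unique.count)
```

Note how `10` becomes a `Double` and `[1, 2, 3]` becomes a `Set` simply because the annotated type conforms to the matching protocol.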
In short, you can use these protocols to hide unnecessary implementation details and possibly ugly initializers of your more complex types. An example use case is how Apple's SourceKit-LSP uses them to represent arbitrary arguments: because the Any type does not conform to Codable, a CommandArgumentType enum is used to represent unknown arguments:
public enum CommandArgumentType: Hashable, ResponseType {
    case null
    case int(Int)
    case bool(Bool)
    case double(Double)
    case string(String)
    case array([CommandArgumentType])
    case dictionary([String: CommandArgumentType])
}
However, because we're dealing with an enum, representing an argument will result in not-so-pretty lines of code:
func getCommandArguments() -> CommandArgumentType {
    return .dictionary(["line": .int(2),
                        "column": .int(1),
                        "name": .string("refactor"),
                        "args": .array([.string("file://a.swift"), .string("open")])])
}
Fortunately, we can use the ExpressibleBy protocols to give the enum better-looking alternatives:
extension CommandArgumentType: ExpressibleByNilLiteral {
    public init(nilLiteral _: ()) {
        self = .null
    }
}

extension CommandArgumentType: ExpressibleByIntegerLiteral {
    public init(integerLiteral value: Int) {
        self = .int(value)
    }
}

extension CommandArgumentType: ExpressibleByBooleanLiteral {
    public init(booleanLiteral value: Bool) {
        self = .bool(value)
    }
}

extension CommandArgumentType: ExpressibleByFloatLiteral {
    public init(floatLiteral value: Double) {
        self = .double(value)
    }
}

extension CommandArgumentType: ExpressibleByStringLiteral {
    public init(extendedGraphemeClusterLiteral value: String) {
        self = .string(value)
    }

    public init(stringLiteral value: String) {
        self = .string(value)
    }
}

extension CommandArgumentType: ExpressibleByArrayLiteral {
    public init(arrayLiteral elements: CommandArgumentType...) {
        self = .array(elements)
    }
}

extension CommandArgumentType: ExpressibleByDictionaryLiteral {
    public init(dictionaryLiteral elements: (String, CommandArgumentType)...) {
        let dict = [String: CommandArgumentType](elements, uniquingKeysWith: { first, _ in first })
        self = .dictionary(dict)
    }
}
This allows us to rewrite getCommandArguments() with easier-to-read tokens:
func getCommandArguments() -> CommandArgumentType {
    return ["line": 2,
            "column": 1,
            "name": "refactor",
            "args": ["file://a.swift", "open"]]
}
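As a quick sanity check that the literal form and the explicit form describe the same value, here is a self-contained sketch. It uses a hypothetical, trimmed-down enum named Argument as a stand-in for CommandArgumentType (the real type lives in SourceKit-LSP and has more cases), so treat it as an illustration rather than the project's actual code.

```swift
// Hypothetical, reduced stand-in for CommandArgumentType (illustration only).
enum Argument: Hashable {
    case int(Int)
    case string(String)
    case array([Argument])
    case dictionary([String: Argument])
}

extension Argument: ExpressibleByIntegerLiteral {
    init(integerLiteral value: Int) { self = .int(value) }
}

extension Argument: ExpressibleByStringLiteral {
    init(stringLiteral value: String) { self = .string(value) }
}

extension Argument: ExpressibleByArrayLiteral {
    init(arrayLiteral elements: Argument...) { self = .array(elements) }
}

extension Argument: ExpressibleByDictionaryLiteral {
    init(dictionaryLiteral elements: (String, Argument)...) {
        // Keep the first value if a key is repeated, as in the article's version.
        self = .dictionary(Dictionary(elements, uniquingKeysWith: { first, _ in first }))
    }
}

// The explicit enum form...
let explicit: Argument = .dictionary(["line": .int(2), "args": .array([.string("open")])])
// ...and the literal form. Nested literals are resolved recursively,
// because every element is itself expected to be an Argument.
let literal: Argument = ["line": 2, "args": ["open"]]

print(literal == explicit)
```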
How it works internally
But how can a token become a full type? As with all compiler magic, we can uncover what's going on by intercepting Swift's compilation steps.
Using the first getURL() method as an example, let's first see how Swift treats ExpressibleBy objects. If we compile the code manually using the -emit-sil argument, we can extract the Swift Intermediate Language (SIL) version of the code, which is the final intermediate representation in Swift before LLVM takes the wheel.
swiftc -emit-sil geturl.swift
The output, which I edited to make it easier to read, looks like this:
sil hidden @$s3bla6getURL10Foundation0C0VyF : $@convention(thin) () -> @out URL {
bb0(%0 : $*URL):
  %1 = string_literal utf8 "https://swiftrocks.com"
  // removed: creating a String type from the string_literal
  // function_ref URL.init(stringLiteral:)
  %8 = function_ref @$s10Foundation3URLV3blaE13stringLiteralACSS_tcfC : $@convention(method) (@owned String, @thin URL.Type) -> @out URL
  %9 = apply %8(%0, %6, %7) : $@convention(method) (@owned String, @thin URL.Type) -> @out URL
  %10 = tuple ()
  return %10 : $()
} // end sil function '$s3bla6getURL10Foundation0C0VyF'
Here's what the method is doing:
1: Create a string_literal token
2: Create a String type from the literal
3: Call URL.init(stringLiteral:) with the String
4: Return the URL
As one would expect, the compiler achieves this magic by replacing the string literal with a call to the relevant ExpressibleBy initializer. Hooray for compiler magic!
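The rewrite described above can be sketched at the source level. The Endpoint type below is hypothetical and exists only for this illustration (it avoids extending URL so the snippet has no Foundation dependency); the idea is that a literal assigned to a conforming type behaves like an explicit call to its stringLiteral initializer.

```swift
// Hypothetical type used only for illustration.
struct Endpoint: ExpressibleByStringLiteral, Equatable {
    let path: String

    init(stringLiteral value: String) {
        self.path = value
    }
}

// What we write:
let fromLiteral: Endpoint = "https://swiftrocks.com"

// Roughly what the compiler turns it into:
let fromInitializer = Endpoint(stringLiteral: "https://swiftrocks.com")

print(fromLiteral == fromInitializer)
```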
Now, to locate where this happens in the compiler, we can grep the Swift source for mentions of "ExpressibleBy", which will point us to several places inside CSApply.cpp. In short, all usages of literals get converted to their ExpressibleBy equivalent, including the "expressibles that are literals themselves" (for example, an Int is itself an ExpressibleByIntegerLiteral). When Swift's type-checker reaches a literal, it gets hold of the relevant protocol and the name of the initializer, both of which can be determined from the kind of literal we're looking at:
Expr *visitNilLiteralExpr(NilLiteralExpr *expr) {
  auto type = simplifyType(cs.getType(expr));
  auto &tc = cs.getTypeChecker();
  auto *protocol = tc.getProtocol(expr->getLoc(),
                                  KnownProtocolKind::ExpressibleByNilLiteral);
  DeclName initName(tc.Context, DeclBaseName::createConstructor(),
                    { tc.Context.Id_nilLiteral });
  //...
}
With that info in hand, the type-checker calls convertLiteralInPlace to replace the full expression with the equivalent ExpressibleBy initializer. The method itself does a lot of stuff, but there's something interesting to note here: if we take a look at KnownProtocols.def, we can see that almost all literal protocols have a default type (ExpressibleByNilLiteral, listed with nullptr, is the exception):
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByArrayLiteral, "Array", false)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByBooleanLiteral, "BooleanLiteralType", true)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByDictionaryLiteral, "Dictionary", false)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByExtendedGraphemeClusterLiteral, "ExtendedGraphemeClusterType", true)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByFloatLiteral, "FloatLiteralType", true)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByIntegerLiteral, "IntegerLiteralType", true)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByStringInterpolation, "StringLiteralType", true)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByStringLiteral, "StringLiteralType", true)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByNilLiteral, nullptr, false)
EXPRESSIBLE_BY_LITERAL_PROTOCOL(ExpressibleByUnicodeScalarLiteral, "UnicodeScalarType", true)
This means that if the expression has no type, or has a type that doesn't conform to the protocol, the literal falls back to the protocol's default type instead. For example, if I removed URL's ExpressibleByStringLiteral conformance from the getURL() example, the SIL code would reveal that the internal String initializer is used instead:
// Pseudocode of what the literal resolves to; this intentionally does not compile.
func getURL() -> URL {
    return String.init(_builtinStringLiteral: "https://swiftrocks.com")
}
This not only allows you to write untyped expressions like let foo = "bar", but it also makes for friendlier diagnostics: thanks to the default type, in a later pass the previous getURL() example will result in our user-friendly Cannot convert value of type 'String' to specified type 'URL' compilation error.
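The default types are easy to observe from Swift itself. The following sketch declares a few untyped constants (the names are arbitrary) and inspects what the compiler inferred for each one:

```swift
// With no type annotation, each literal falls back to its default type.
let foo = "bar"                    // StringLiteralType, i.e. String
let count = 10                     // IntegerLiteralType, i.e. Int
let ratio = 2.5                    // FloatLiteralType, i.e. Double
let isOn = true                    // BooleanLiteralType, i.e. Bool
let list = [1, 2, 3]               // Array of the element's default type
let map = ["name": "SwiftRocks"]   // Dictionary of the defaults for key and value

print(type(of: foo), type(of: count), type(of: ratio), type(of: isOn))
print(type(of: list), type(of: map))
```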
Follow me on Twitter (@rockbruno_), and let me know of any suggestions and corrections you want to share.