Faster Array Operations With CollectionOfOne in Swift

Faster Array Operations With CollectionOfOne in Swift

CollectionOfOne is a type inside the Swift Standard Library that defines a collection of a single element. While this sounds a bit weird, there's a good reason for this to exist. Let's check out why.

Operating on single elements

Personally, I am guilty of creating single-element arrays to make simple operations like this:

someArray += [someElement]

I used to do things like this for readability reasons (as you may or may not agree that this looks better than an append() call in some cases), but it turns out this is bad for multiple reasons. Here's what happens internally when you do something like this:

* The compiler translates the array literal into Array's ExpressibleByArrayLiteral initializer.

* The array initializer will create a buffer to hold the single element.

* The single element is copied into that buffer.

* += is a wrapper for appendContents(of:), which first increases the buffer's capacity based on the number of elements being added.

* Finally, the elements are copied from their original buffer to the one of the array they're being added to.

That's an enormous amount of stuff for something as simple as dealing with a single element! We're creating the entire stack for an array structure just for a simple append() when all we needed to do was get a hold of the single element and copy it into the original buffer.

Usually, this sort of innocent mistake is optimized by the compiler, but in this case, even with higher optimization levels, the compiler does not appear to know how to optimize the array allocation part out of single-element operations. If we extract the SIL out of that line of code, we'll get multiple references for array allocations:

// function_ref _allocateUninitializedArray<A>(_:)

CollectionOfOne - Collections, but not really

To treat cases like this where the operation relies on a collection but you're only dealing with one element, the CollectionOfOne type can be used to efficiently represent these elements without having to worry about anything else that would normally come from the Collection protocol.

The way this works is straight-forward: the CollectionOfOne type implements Collection, meaning that it's able to respond to iterations and more. But instead of implementing these things with buffers like in an Array, the CollectionOfOne type simply holds a reference to the element:

@frozen // trivial-implementation
public struct CollectionOfOne<Element> {
  @usableFromInline // trivial-implementation
  internal var _element: Element

  /// Creates an instance containing just the given element.
  ///
  /// - Parameter element: The element to store in the collection.
  @inlinable // trivial-implementation
  public init(_ element: Element) {
    self._element = element
  }
}

Because it doesn't need to rely on buffers and all of that, it can directly return the element when being iterated:

public mutating func next() -> Element? {
  let result = _elements
  _elements = nil
  return result
}

And because we're dealing with a single element, we don't need to worry about moving indexes or keeping track of element counts:

public var startIndex: Index {
  return 0
}

public var endIndex: Index {
  return 1
}

public func index(after i: Index) -> Index {
  _precondition(i == startIndex)
  return 1
}

public func index(before i: Index) -> Index {
  _precondition(i == endIndex)
  return 0
}

public var count: Int {
  return 1
}

This makes CollectionOfOne a very efficient way to represent a single element. If we modify the original example to use CollectionOfOne, we can give our app a small performance boost.

// Adding 10 million elements
someArray += [someElement]                // 8 seconds
someArray += CollectionOfOne(someElement) // 6.9 seconds

Note: Adding to an array was used as the example because I believe it's the most common way of reaching this problem, but this is actually a bad use case of CollectionOfOne. In this case, the best performance you can get is by using append() directly as that'll ensure the element is directly copied into the array's buffer without any overhead.

someArray.append(someElement) // 4.9 seconds

The cases where using CollectionOfOne is the best approach possible is when you are forced to operate on Collections. In the Standard Library for example, CollectionOfOne is used to insert elements in specific positions of an Array:

public mutating func insert(_ newElement: __owned Element, at i: Int) {
  _checkIndex(i)
  self.replaceSubrange(i..<i, with: CollectionOfOne(newElement))
}

Because replaceSubrange replaces a Collection with another one, cases where the replacement is only a single element can greatly benefit from using CollectionOfOne.

Conclusion

When the performance gain isn't very high, I believe you should always value readability and maintainability over the gain. However, its always a good idea to understand the possible options and their trade-offs. Even if you never use CollectionOfOne directly, knowing of its existence and why it exists may help you make better decisions in the future.

References and Good Reads

Array.swift
CollectionOfOne.swift
The Swift Source Code