Learn how you can use Swift 5.7 to design advanced abstractions using protocols. We'll show you how to use existential types, explore how you can separate implementation from interface with opaque result types, and share the same-type requirements that can help you identify and guarantee relationships between concrete types.
To get the most out of this session, we recommend first watching “Embrace Swift generics" from WWDC22.
♪ ♪ Hi, I'm Slava from the swift compiler team. Welcome to Design Protocol Interfaces in Swift.
I'm going to pick up where the Embrace Swift Generics talk left off, and show you some advanced techniques for abstracting over concrete types and modeling type relationships using protocols. This talk will cover both existing language features, as well as some of the new capabilities introduced in Swift 5.7.
This talk has three main themes: First, I'll show you how protocols with associated types interact with existential 'any' types, by explaining how 'result type erasure' works. Next, I'll explain using opaque result types to improve encapsulation by separating interface from implementation. For the final topic, you will see how same-type requirements in protocols can model relationships between multiple different sets of concrete types.
Let's start by learning how protocols with associated types interact with existential types. Here, we have a data model with a pair of protocols, and four concrete types. There are two types of animals, chickens and cows, and two types of food, eggs and milk. Chickens produce eggs, and cows produce milk. To abstract over the production of food, I'm going to add a produce() method to the Animal protocol. You might remember from the 'Embrace swift generics' talk that that the best way to abstract the different return types of produce() on Cow and Chicken is to use an associated type. By using an associated type, we're declaring that: given some concrete type of Animal, calling produce() returns some specific type of Food, that depends on the concrete Animal type. We can show this relationship with a diagram. The protocol 'Self' type stands in for the actual concrete type conforming to the 'Animal' protocol. The 'Self' type has an associated 'Commodity' type, conforming to 'Food'. Let's look at the relationships between the concrete Chicken and Cow types, and the associated type diagram for the Animal protocol.
The Chicken type conforms to the Animal protocol with a CommodityType of 'Egg'. And the Cow type conforms to the Animal protocol with a CommodityType of 'Milk'. Now, let's say we have a farm full of animals. The 'animals' stored property on Farm is a heterogenous array of 'any Animal'. In embrace Swift generics, we saw how the 'any Animal' type has a box representation that has the ability to store any concrete type of animal dynamically. This strategy of using the same representation for different concrete types is called type erasure.
The produceCommodities() method maps over the array of animals, calling the produce() method on each one. The method looks simple, but we know that type erasure will eliminate static type relationships to the underlying type of animal, so it's worth digging deeper to understand why this code type checks.
The 'animal' parameter in the map() closure has type 'any Animal'. The return type of 'produce()' is an associated type. When you call a method returning an associated type on an existential type, the compiler will use type erasure to determine the result type of the call. Type erasure replaces these associated types with corresponding existential types that have equivalent constraints. We've erased the relationship between the concrete Animal type and the associated CommodityType by replacing them with 'any Animal' and 'any Food'. The type 'any Food' is called the upper bound of the associated CommodityType. Since the produce() method is called on an 'any Animal', the return value is type erased, giving us a value of type 'any Food'. This is exactly the type we expect here.
Let's take a closer look at how associated-type erasure works, which is a new feature in Swift 5.7. An associated type appearing in the result type of a protocol method – on the right-hand side of the arrow – is said to be in 'producing position', because calling the method will produce a value of this type. When we call this method on 'any Animal', we don't know the concrete result type at compile time, but we do know that it is a subtype of the upper bound. Here in this example, we're calling produce() on an 'any Animal' that holds a Cow at runtime. In our case, the produce() method on Cow returns Milk. Milk can be stored inside of an 'any Food', which is the upper bound of the associated CommodityType of the Animal protocol.
This is always safe, for all concrete types that conform to the Animal protocol.
On the other hand, let's think about what happens if the associated type appears in the parameter list of a method or initializer. Here, the eat() method on the Animal protocol has the associated FeedType in consuming position. We need to pass in a value of this type to call the method. Since the conversion goes in the other direction, type erasure cannot be performed. The upper bound existential type for the associated type does not safety convert to the actual concrete type, because the concrete type is unknown. Let's look at an example. Once again, we have an 'any Animal' storing a Cow. Suppose that the 'eat' method on Cow takes Hay. The upper bound of the Animal protocol's associated 'FeedType' is 'any AnimalFeed'. But given an arbitrary 'any AnimalFeed', there is no way to statically guarantee that it stores the 'Hay' concrete type. Type erasure does not allow us to work with associated types in consuming position. Instead, you must unbox the existential 'any' type by passing it to a function that takes an opaque 'some' type.
This type erasure behavior with associated types is actually similar to an existing language feature you may have seen in Swift 5.6. Consider a protocol for cloning reference types. This protocol defines a single clone() method, returning Self. When you call clone() on a value of type 'any Cloneable', the result type 'Self', is type erased to its upper bound. The upper bound of the Self type is always the protocol itself, so we get back a new value of type 'any Cloneable'. So to summarize: you can use 'any' to declare that the type of a value is an existential type that stores some concrete type conforming to a protocol. This even works with protocols that have associated types. When calling a protocol method with an associated type in producing position, the associated type is type-erased to its upper bound, which is another existential type that carries the associated type's constraints. Abstracting over concrete types isn't only useful for function inputs - it's useful for function outputs, too, so that concrete types are only visible from the implementation. Let's take a look at how to abstract away concrete result types to separate the essential interface of a piece of code from its implementation details, making static type assignments more modular and robust in the face of changes. Let's generalize the Animal protocol to allow feeding Animals. Animals get hungry, and when they're hungry they need to eat. Let's add an isHungry property to the Animal protocol. The feedAnimals() method on Farm will feed the subset of animals that are hungry. I've split off the computation of this subset of hungry animals into a hungryAnimals property. This initial implementation of hungryAnimals() uses the filter() method to select the subset of animals where the isHungry property is true. Calling filter() on an array of 'any Animal' returns a new array of 'any Animal'. Now you might notice that feedAnimals() only iterates over the result of hungryAnimals once, and then immediately discards this temporary array. This is inefficient if the farm contains a large number of hungry animals. One way to avoid this temporary allocation is to use the standard library's lazy collections feature. By replacing the call to 'filter' with 'lazy.filter', we get what is known as a lazy collection. A lazy collection has the same elements as the array returned by a plain call to 'filter', but it avoids the temporary allocation. However, now the type of the 'hungryAnimals' property must be declared as this rather complex concrete type, 'LazyFilterSequence of Array of any Animal'. This exposes an unnecessary implementation detail. The client, feedAnimals(), doesn't care that we used 'lazy.filter' in the implementation of 'hungryAnimals'; it only needs to know that it's getting some collection that it can iterate over. An opaque result type can be used to hide the complex concrete type behind the abstract interface of a Collection. Now clients calling 'hungryAnimals' only know they're getting some concrete type conforming to the Collection protocol, but they don't know the specific concrete type of collection.
However as written, this actually hides too much static type information from the client. We're declaring that hungryAnimals outputs some concrete type conforming to Collection, but we don't know anything about this Collection's Element type. Without the knowledge that the element type is 'any Animal', all we can do with the element type is pass it around; we can't call any of the methods of the Animal protocol. Let's focus on the opaque result type 'some Collection'. We can strike the right balance between hiding implementation details and exposing a sufficiently-rich interface by using a constrained opaque result type. Constrained opaque result types are new in Swift 5.7. A constrained opaque result type is written by applying type arguments in angle brackets after the protocol name. The Collection protocol has a single type argument, the Element type. Now once 'hungryAnimals' is declared with a constrained opaque result type, the fact that it is actually a 'LazyFilterSequence of an array of any Animal' is hidden from the client; but the client still has the knowledge that it is some concrete type conforming to Collection, whose Element associated type is equal to 'any Animal'. This is precisely the interface that we want here. Inside the for loop in 'feedAnimals()', the 'animal' variable has the type 'any Animal', allowing methods of the Animal protocol to be called on each hungry animal. This all works because the Collection protocol declares that the Element associated type is a primary associated type. You can declare your own protocols with primary associated types by naming one or more associated types in angle brackets after the protocol name, like this. The associated types that work best as primary associated types are those that are usually provided by the caller, such as an Element type of a collection, as opposed to implementation details, such as the collection's Iterator type. Often, you will see a correspondence between the primary associated types of a protocol, and the generic parameters of a concrete type conforming to this protocol. Here, you can see that the Element primary associated type of 'Collection' is implemented by the 'Element' generic parameter of Array and Set, two concrete types defined by the standard library that both conform to Collection. 'Collection of Element' can be used with opaque result types using the 'some' keyword, as well as with constrained existential types using the 'any' keyword. Before Swift 5.7, you would've needed to write your own data type to represent an existential type with a specific generic argument. Swift 5.7 builds this concept into the language with constrained existential types.
If we wanted hungryAnimals to have the option of whether to compute the hungryAnimals lazily or eagerly, using an opaque Collection of any Animal would result in an error that the function returns two different underlying types. We can fix this by instead returning 'any Collection of any Animal', signaling that this API can return different types across calls. The ability to constrain primary associated types gives opaque types and existential types a new level of expressivity. This can be used with various standard library protocols such as Collection; you can also declare your own protocols to have primary associated types.
Writing generic code using opaque types must rely on abstract type relationships. Let's discuss how to identify and guarantee necessary type relationships between multiple abstract types using related protocols.
We're going to add a new associated type to the Animal protocol for the concrete type of animal feed that this animal eats, together with an eat() method that tells the animal to consume this type of feed. To make things more interesting, I'm going to introduce an additional complication: before we can feed an animal, we must grow the appropriate type of crop, and harvest the crop to produce the feed. Here is the first set of concrete types. A cow eats hay, so given a cow, we first need to grow some hay. This gives us alfalfa, which is harvested and processed into hay, that the cow can eat. Here's the second set of concrete types. A chicken eats scratch, so if you bring me a chicken, we first need to grow a type of grain called millet that we harvest and process to produce chicken scratch, which we feed to our chicken. I want to abstract over these two sets of related concrete types, so I can implement the feedAnimal() method once, and have it feed both cows and chickens, as well as any new types of animals I might adopt in the future. Since feedAnimal() needs to work with the eat() method of the Animal protocol, which has an associated type in consuming position, I'm going to unbox the existential by declaring that the feedAnimal() method takes 'some Animal' as a parameter type. To start, I'll define a pair of protocols, AnimalFeed and Crop, using what we know about protocols and associated types so far. AnimalFeed has an associated CropType, which conforms to Crop, and Crop has an associated FeedType, which conforms to AnimalFeed. As before, we can look at a diagram of type parameters for each protocol. First, let's look at AnimalFeed. Every protocol has a 'Self' type, which stands for the concrete conforming type. Our protocol has an associated 'CropType', which conforms to Crop. The associated 'CropType' has a nested associated 'FeedType', which conforms to AnimalFeed, which has a nested associated 'CropType' conforming to Crop, and so on. In fact, this back-and-forth continues forever, with an infinite nesting of associated types that alternate between conforming to AnimalFeed and Crop.
With the Crop protocol, we have a similar situation, just shifted by one. We start with the 'Self' type, conforming to 'Crop', which has an associated 'FeedType', conforming to AnimalFeed. This has a nested associated 'CropType', conforming to Crop and so on...
To infinity. Let's see if these protocols correctly model the relationship between our concrete types. Recall that before we can feed an animal, we need to grow the crop that is then processed into the correct type of animal feed. grow() is a static method in the AnimalFeed protocol, which means it must be called directly on a type conforming to AnimalFeed, and not on a specific value whose type conforms to AnimalFeed. We need to write down a the name of a type conforming to AnimalFeed, but all we have is a specific value, of some type conforming to Animal, a different protocol. Well, we can get the type of this value, which we know is some type conforming to Animal, and Animal has an associated FeedType, which conforms to AnimalFeed.
This type can be used as the base of the method call grow(). The grow() method on AnimalFeed returns a value whose type is the nested associated CropType of AnimalFeed. We know that CropType conforms to Crop, so I can call harvest() on it. But what do I get back? harvest() is declared to return the associated FeedType of the Crop protocol. In our case, since the base of the call is (some Animal).FeedType.CropType, harvest() will output a value of type (some Animal).FeedType.CropType.FeedType. Unfortunately, this is the wrong type. The eat() method on (some Animal) expects (some Animal).FeedType, and not (some Animal).FeedType.CropType.FeedType. The program is not well-typed. These protocol definitions, as written, do not actually guarantee that if we start with a type of animal feed, and then grow and harvest this crop, we'll get back the same type of animal feed that we started with, which is what our animal expects to eat. Another way to think about it is that these protocol definitions are too general - they don't accurately model the desired relationship between our concrete types. To understand why, let's look at our Hay and Alfalfa types. When I grow hay, I get alfalfa, and when I harvest alfalfa, I get hay, and so on. Now imagine I'm refactoring my code, and I accidentally change the return type of the harvest() method on Alfalfa to return Scratch instead of Hay. After this accidental change, the concrete types still satisfy the requirements of the AnimalFeed and Crop protocols, even though we violate our desired invariant that growing and harvesting a crop produces the same type of animal feed that we started with. Let's look at the AnimalFeed protocol again. the real problem here is that in a sense, we have too many distinct associated types. We need to write down the fact that two of these associated types are actually the same concrete type. This will prevent incorrectly-written concrete types from conforming to our protocols; it will also to give the feedAnimal() method the guarantee that it needs. We can express the relationship between these associated types using a same-type requirement, written in a 'where' clause. A same-type requirement expresses a static guarantee that two different, possibly nested associated types must in fact be the same concrete type. Adding a same-type requirement here imposes a restriction on the concrete types that conform to the AnimalFeed protocol. In this same-type requirement here, we're declaring that `Self dot CropType dot FeedType' is the same type as 'Self'. what does this look like in our diagram? Well, here is how we can visualize it: Each concrete type conforming to AnimalFeed has a CropType, which conforms to Crop. However, the FeedType of this CropType, is not just some other type conforming to AnimalFeed, it is the same concrete type as the original AnimalFeed. Instead of an infinite tower of nested associated types, I've collapsed all relationships down to a single pair of related associated types. What about the 'Crop' protocol? Here, the Crop's FeedType has collapsed down to a pair of types, but we still have one too many associated types. We want to say that the Crop's FeedType's Crop Type is the same type as the Crop that we originally started with.
Now that these two protocols have been equipped with same-type requirements, we can revisit the 'feedAnimal()' method again. We start with the type of some Animal, as before. and we get the animal's feed type, which we know conforms to the AnimalFeed protocol. When we grow this crop, we get some animal's feed type's crop type. But now, when we harvest this crop, instead of getting yet another nested associated type, we get exactly the feed type that our animal expects, and the happy animal is now guaranteed to eat() the correct type of animal feed that we just grew. Finally, let's look at an associated type diagram for the Animal protocol, which pulls everything together we've seen so far.
Here are the two sets of conforming types: first, we have Cow, Hay, and Alfalfa. Second, we have Chicken, Scratch and Millet. Notice how our three protocols precisely model the relationships between each set of three concrete types. By understanding your data model, you can use same-type requirements to define equivalences between these different nested associated types. Generic code can then rely on these relationships when chaining together multiple calls to protocol requirements. During this session, we explored when type erasure is safe, and when we need to be in a context where type relationships are guaranteed. Then, we discussed how to strike the right balance between preserving rich type information and hiding implementation details using primary associated types, which can be used with both opaque result types and existential types. Finally, we saw how to identify and guarantee type relationships between sets of concrete types using same-type requirements across the protocols that represent those related sets of types. Thank you for joining me. I hope you have a great WWDC.
Looking for something specific? Enter a topic above and jump straight to the good stuff.
An error occurred when submitting your query. Please check your Internet connection and try again.