Building Collaborative AR Experiences

With iOS 13, ARKit and RealityKit enable apps to establish shared AR experiences faster and easier than ever. Understand how collaborative sessions allow multiple devices to build a combined world map and share AR anchors and updates in real-time. Learn how to incorporate collaborative sessions into ARKit-based apps, then roll into SwiftStrike, an engaging and immersive multiplayer AR game built using RealityKit and Swift.

WWDC19
Good afternoon, everyone. Welcome to the session.
Today, I would like to talk about multiuser AR in ARKit 3.
My name is Kuen-han. I'm an ARKit engineer. I would like to show you all the enhancements we made in ARKit 3, so that building a multiuser AR app becomes easy and intuitive.
So, you can focus on all the amazing content within your app.
Are you interested in bringing more people to the AR world? Then, this talk is for you.
Let's begin.
Building a shared AR experience is about synchronization.
Like the SwiftStrike video you are seeing here, not only do we need to keep tracking the location of the bowling pins, but we also need to track the location of the users and their interaction with the ball.
But sometimes, tracking that information can be tricky and complicated to get right.
And that is what ARKit 3 wants to solve for you.
In ARKit 3, we introduce collaborative session, which makes sharing 3D content locations easy. And with RealityKit, all the game setup and physics simulation can be synchronized automatically under the hood. So, you can focus on your content.
Let's look at today's agenda.
First, we're going to introduce the collaborative session, a new way to build multiuser apps in ARKit 3.
Next, we're going to dive into some best practices for using ARAnchors, especially in the context of multiuser AR.
Last, David is going to introduce you to SwiftStrike.
By utilizing ARKit 3 and RealityKit, SwiftStrike provides a new level of multiuser AR game experience. Let's start with collaborative session.
To begin with, let's recap last year's multiuser AR feature we delivered in ARKit 2, Map Save and Load. It is designed for persistent AR experiences. The user can record their current AR experience and resume it after loading the map. The same feature can also be used for multiuser AR.
Within this feature, we introduced a data structure called the ARWorldMap, which contains a map of 3D landmarks that are used for camera position tracking, and also a list of ARAnchors, which represent the 3D coordinates of your virtual content.
Within this example, we have a tabletop scene and we load the ARWorldMap on top of it. So, we have several 3D landmarks on the table and two ARAnchors.
When you use this feature for multiuser AR, each user loads the same ARWorldMap.
Then, ARKit will use the 3D landmarks within the ARWorldMap to try to localize the device itself against the map.
Once ARKit manages to do that, the user can start to see the same virtual content at the right physical location.
This feature provides a good multiuser AR experience if you have already pre-mapped the environment and have all the anchors you need saved in the ARWorldMap.
However, any new information that ARKit gathers afterwards won't be shared.
For instance, one of the users may keep exploring one side of the table and add one extra anchor while the other user does the same.
That newly learned map information and those ARAnchors won't be visible to all the users. So, that makes this feature a one-time sharing AR experience, and it is not optimized for unseen environments beyond the pre-mapped area. And that is what collaborative session wants to solve for you.
Collaborative session is mainly designed for live multiuser AR experiences. All the learned map information and anchors are shared continuously throughout the full session. That means any user can add an anchor at any point in time, and it will be reflected on all the users' screens. Also, because every user is exploring the map together, they benefit each other by getting the best and most consistent tracking experience. That makes this feature friendly for unseen environments. You can also use this feature with or without a map. In addition, this feature uses a decentralized design with a peer-to-peer communication pattern, similar to MultipeerConnectivity. Therefore, there is no host user within the session.
Any user can join or leave the session at any point in time without interrupting other users' AR experiences.
Let's see one example.
Here, we have two users running in collaborative sessions.
They both start their own AR experiences. At the beginning, they both do a small world exploration and put in one ARAnchor. As the users keep exploring the environment, once they start seeing the area other users have explored before, they can start seeing the ARAnchors added by others. In this case, the first user starts seeing the yellow cube, while the second user starts seeing the purple cube. Afterwards, any anchors added by the users will immediately show up on the other's screen.
Because the sharing happens live and continuously, there is no interruption to the users' AR experience. Also, most existing multiuser AR apps require a host user within the session.
With collaborative session, it is now possible to build a decentralized multiuser AR app. Next, I'm going to dive deeper into this decentralized design and how it affects the coordinate systems within a collaborative session.
In this decentralized design, there is no host user within the session. That means each user can start their own AR experience before they start receiving data from each other. So, each user can have their own AR world coordinates. In this example, we have two users running a collaborative session. Each user starts with a small world exploration and puts one anchor on each side of the table.
Then, within the collaborative session, ARKit will transmit the so-called collaboration data, which pushes a piece of your ARWorldMap information to all the other users, where it is saved as an external map. Then, as the users keep exploring the environment, once they start to see the same area others have seen before, ARKit will utilize those 3D landmarks in the common area and try to localize itself against the external map. When that succeeds, those external maps will be merged locally into each user's local coordinate system.
Note that at this point, the users still have different world coordinates. But because ARAnchors are attached to the map, the users can still see the virtual object at the right physical location. And that is why it is important to use ARAnchors in a collaborative session.
So, let's take a look at how to use collaborative session in ARKit 3. In order to use collaborative session, first you need to make sure all the users are on the same networking layer. This networking layer can be either MultipeerConnectivity or any other alternative solution that provides reliable communication.
Once they are in the same networking layer, they can transmit information to each other. Then, you simply need to enable collaboration in your own AR session.
Once that is enabled, your AR session will periodically generate the collaboration data as I mentioned before. Then, it is the app's responsibility to transmit this data to all the other users. That is the only new code you need to add in ARKit 3 in order to use collaborative session. Let's take a look.
To begin with, you need to create an ARWorldTrackingConfiguration. Then, you simply set isCollaborationEnabled to true.
Then, you just call session.run to run your AR session. If you are using RealityKit, this is the only new code you need to add to use collaborative session. If you are not using RealityKit, then you need to implement two additional delegate functions to transmit the collaboration data.
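The configuration step described here can be sketched in a few lines. The two function names are hypothetical helpers introduced for illustration; `ARWorldTrackingConfiguration`, `isCollaborationEnabled`, and `ARSession.run(_:)` are the ARKit pieces the talk refers to.

```swift
import ARKit

/// Builds a world-tracking configuration with collaboration switched on.
/// (Hypothetical helper name; the property it sets is the one named in the talk.)
func makeCollaborativeConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    configuration.isCollaborationEnabled = true
    return configuration
}

/// Starts the given session in collaborative mode.
func startCollaborativeSession(_ session: ARSession) {
    session.run(makeCollaborativeConfiguration())
}
```

Once the session runs with this configuration, it begins producing collaboration data that the app is expected to forward to the other participants.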
The first delegate function is ARSession didOutputCollaborationData. When your own AR session creates this collaboration data, you need to transmit it to all the other users. Here, we have one example using MultipeerConnectivity.
If your networking solution reports a failure to transmit this data, then it is your app's responsibility to transmit this data again to make sure the data is delivered.
Then, in the delegate function where you receive this data, you simply need to call arSession.update to pass the received data to your underlying AR session.
By implementing these two delegate functions, you complete the flow to transmit collaboration data.
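A minimal sketch of those two delegate functions, assuming MultipeerConnectivity as the networking layer. The class name `CollaborationRelay` is hypothetical, and serializing with `NSKeyedArchiver` is one possible encoding rather than something the talk prescribes. Peer discovery and the retry-on-failure policy are left out.

```swift
import ARKit
import MultipeerConnectivity

/// Hypothetical helper that forwards collaboration data between ARKit and MultipeerConnectivity.
final class CollaborationRelay: NSObject, ARSessionDelegate, MCSessionDelegate {
    let arSession: ARSession
    let mcSession: MCSession

    init(arSession: ARSession, mcSession: MCSession) {
        self.arSession = arSession
        self.mcSession = mcSession
        super.init()
        arSession.delegate = self
        mcSession.delegate = self
    }

    // Outgoing: ARKit produced collaboration data, so send it to every connected peer.
    func session(_ session: ARSession, didOutputCollaborationData data: ARSession.CollaborationData) {
        guard !mcSession.connectedPeers.isEmpty else { return }
        do {
            let encoded = try NSKeyedArchiver.archivedData(withRootObject: data,
                                                           requiringSecureCoding: true)
            try mcSession.send(encoded, toPeers: mcSession.connectedPeers, with: .reliable)
        } catch {
            // The app is responsible for retransmitting; a real app would queue and retry here.
            print("Failed to send collaboration data: \(error)")
        }
    }

    // Incoming: decode the peer's data and hand it to the local AR session.
    func session(_ session: MCSession, didReceive data: Data, fromPeer peerID: MCPeerID) {
        guard let collaborationData = try? NSKeyedUnarchiver.unarchivedObject(
            ofClass: ARSession.CollaborationData.self, from: data) else { return }
        arSession.update(with: collaborationData)
    }

    // Remaining required MCSessionDelegate methods, unused in this sketch.
    func session(_ session: MCSession, peer peerID: MCPeerID, didChange state: MCSessionState) {}
    func session(_ session: MCSession, didReceive stream: InputStream,
                 withName streamName: String, fromPeer peerID: MCPeerID) {}
    func session(_ session: MCSession, didStartReceivingResourceWithName resourceName: String,
                 fromPeer peerID: MCPeerID, with progress: Progress) {}
    func session(_ session: MCSession, didFinishReceivingResourceWithName resourceName: String,
                 fromPeer peerID: MCPeerID, at localURL: URL?, withError error: Error?) {}
}
```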
Once the collaboration data transmission is running in the background, the transmission will happen throughout the whole session.
Then, each user just starts their own AR experience, as before. As I mentioned earlier, the shared AR experience will begin after the user can localize against the other user's map.
When that happens, your own AR session will start to receive the first ARAnchors added by others, which can serve as an indication of the beginning of your shared AR experience.
Let's look at some new properties for ARAnchors in collaborative session. Within a collaborative session, the lifetimes of all user-created ARAnchors are synchronized. That means the user can add or remove anchors at any point in time, and that will be reflected for all the other users. Also, we add a session identifier to each ARAnchor, which can be used as an indicator of who the original creator of this ARAnchor is, so your app can react accordingly. Last, only user-created ARAnchors are shared. That excludes all the subclassed ARAnchors, including ARImageAnchor, ARPlaneAnchor, and ARObjectAnchor. That also excludes user-subclassed ARAnchors, which were used to attach user data within Map Save and Load. At first, you may think this is a drawback of the collaborative session design. But don't worry.
This is where collaborative session and RealityKit go hand-in-hand. By using RealityKit, you can attach your user data to the corresponding entity component. Once you attach your entity to the corresponding ARAnchor, all that information will be synchronized under the hood, including all the physics simulation, scene changes, and sound effects.
For more information, you may want to check out Introducing RealityKit and Reality Composer, which we presented on Tuesday. So, let's take a look at the code for using ARAnchor in a collaborative session. Now, within a collaborative session, when you receive the ARSession didAdd anchors delegate callback, you may want to check the session identifier to see whether this anchor was added by yourself or by others.
Same thing when you receive the ARSession didRemove anchors callback. You may also want to check whether it was removed by yourself or by others, so your app can react accordingly. So, that summarizes the ARAnchor, which represents the 3D existence of your virtual object.
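As a rough sketch of that check, the anchor's `sessionIdentifier` can be compared with the local session's `identifier`. The helper function and the class name are hypothetical, and what the app does in each branch is left as a print statement.

```swift
import ARKit

/// Returns true when the anchor was created by the given (local) session.
func isLocallyCreated(_ anchor: ARAnchor, in session: ARSession) -> Bool {
    return anchor.sessionIdentifier == session.identifier
}

/// Hypothetical delegate that only reports where each anchor came from.
final class AnchorOriginLogger: NSObject, ARSessionDelegate {
    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for anchor in anchors {
            let origin = isLocallyCreated(anchor, in: session) ? "this device" : "another participant"
            print("Anchor \(anchor.identifier) was added by \(origin)")
        }
    }

    func session(_ session: ARSession, didRemove anchors: [ARAnchor]) {
        for anchor in anchors {
            let origin = isLocallyCreated(anchor, in: session) ? "this device" : "another participant"
            print("Anchor \(anchor.identifier) from \(origin) was removed")
        }
    }
}
```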
Further, in collaborative session it's also important to know other users' position. For that, we introduce a new anchor called ARParticipantAnchor.
ARParticipantAnchor represents other users' location within your own world coordinates.
It is updated at a high rate, the same as the other users' AR frame rate.
This ARParticipantAnchor is automatically created by your own AR session when it manages to localize itself against the other user's map, which means you can also use this ARParticipantAnchor as an indication of the beginning of your shared AR experience.
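One way to pick up participant anchors is to filter the anchors delivered to the session delegate. This is a sketch only: `participantAnchors(in:)` and `ParticipantTracker` are hypothetical names, and storing the transforms in a dictionary stands in for whatever visualization the app uses.

```swift
import ARKit

/// Extracts the participant anchors from a batch of anchors.
func participantAnchors(in anchors: [ARAnchor]) -> [ARParticipantAnchor] {
    return anchors.compactMap { $0 as? ARParticipantAnchor }
}

/// Hypothetical tracker that keeps the latest pose of every other participant.
final class ParticipantTracker: NSObject, ARSessionDelegate {
    private(set) var participantTransforms: [UUID: simd_float4x4] = [:]

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        // The first participant anchor can be treated as the start of the shared experience.
        for participant in participantAnchors(in: anchors) {
            participantTransforms[participant.identifier] = participant.transform
        }
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for participant in participantAnchors(in: anchors) {
            participantTransforms[participant.identifier] = participant.transform
        }
    }

    func session(_ session: ARSession, didRemove anchors: [ARAnchor]) {
        for participant in participantAnchors(in: anchors) {
            participantTransforms[participant.identifier] = nil
        }
    }
}
```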
By using ARAnchor and ARParticipantAnchor, you can correctly visualize other users' 3D content in your own world coordinates. So, that is how you would use collaborative session in ARKit 3. Let's look at some practical advice on how to start a shared AR experience using collaborative session.
As I mentioned before, a shared AR experience will begin after each user localizes themselves in the other users' maps. That means they have to see the area other users have seen before. But sometimes, depending on the users' motion, this can take time.
If you want the users to get to the shared experience faster, we have two pieces of advice.
First, it is recommended to have one of the users approach the other user to have the same camera perspective. For instance, in this example, we have two users looking at the table, but from different directions. Then, it's not likely for ARKit to localize them and begin the shared AR experience. However, if you have the two users stand side-by-side and look in the same direction, then it is more likely for ARKit to localize and start a shared AR experience.
Second, while you are doing this, it is best to have one of your users stay in the mapped tracking status. That is, ARFrame.WorldMappingStatus.mapped.
By doing this, you make sure one of the users is actually seeing the 3D landmarks that are stored inside the ARWorldMap. Therefore, when the other user approaches, it is more likely they can use those 3D landmarks to localize themselves and start a shared AR experience.
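The mapping status mentioned here can be read from each frame and turned into on-screen guidance. The sketch below is illustrative only: the function name and the hint strings are made up, while `ARFrame.WorldMappingStatus` and its cases come from ARKit.

```swift
import ARKit

/// Maps the current world-mapping status to a hint for the player (hypothetical wording).
func localizationHint(for status: ARFrame.WorldMappingStatus) -> String {
    switch status {
    case .mapped:
        return "Hold this view so the other player can stand next to you and join."
    case .extending:
        return "Keep looking around this area for a moment."
    case .limited:
        return "Move the device slowly to map more of the surroundings."
    case .notAvailable:
        return "Waiting for tracking to start."
    @unknown default:
        return "Keep exploring the area."
    }
}
```

An app could call this from the `session(_:didUpdate:)` frame callback with `frame.worldMappingStatus` and show the result in its UI.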
Let's see one example.
Here, we have two users running a collaborative session. The first user simply does a small world exploration, adds one ARAnchor, and stays in the mapped tracking status. The other user simply approaches the first user and sees the same view. Then they will start seeing the same anchors, which also indicates the beginning of your shared AR experience.
This advice is also applicable to last year's Map Save and Load. So, you may want to put this advice in your app to guide the motion of the two users so they can start their shared AR experience faster.
So, that summarizes our introduction and suggestions for using collaborative session. Our API is simple and intuitive.
With RealityKit, you only need to add a few lines to enable the experience. I encourage you to give it a try and see the new multiuser AR experience in ARKit 3.
Next, I would like to talk about the best practices for using ARAnchors. As I mentioned before, ARAnchors are the main way to share virtual content within collaborative session.
Here, we have three simple but effective suggestions for using ARAnchor.
To begin with, let's look back at the ARWorldMap.
As I mentioned before, each ARWorldMap consists of a collection of 3D map landmarks and, also, a list of ARAnchors.
In addition, we also save a collection of camera poses. Those camera poses represent the camera view when the 3D landmarks are first observed. For instance, in this example, we have five camera poses, which are created when the 3D landmarks are first created.
So, with these camera views we can segment the 3D landmarks into different groups that represent different parts of the map.
Once you have those views, when the user adds an ARAnchor, the user provides a global position of this ARAnchor with respect to the world coordinates.
However, what is actually saved within our ARWorldMap is the relative position of this ARAnchor to one of the nearest views. It is these relative positions that we keep inside the ARWorldMap and also transmit in the collaborative session. This makes sure that, even if users have different world coordinates, they can still see the ARAnchor at the right physical location.
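The idea of storing an anchor relative to a view can be illustrated with plain transform math. This is not ARKit's internal implementation, only a sketch of the principle under the assumption that both users know the pose of the same view in their own world coordinates. All function names and numbers are made up for the example.

```swift
import simd

/// Builds a pure translation transform.
func makeTranslation(_ x: Float, _ y: Float, _ z: Float) -> simd_float4x4 {
    var matrix = matrix_identity_float4x4
    matrix.columns.3 = SIMD4<Float>(x, y, z, 1)
    return matrix
}

/// Expresses an anchor's world pose relative to a view (camera pose) in the same world.
func relativePose(ofAnchor anchorWorldPose: simd_float4x4,
                  toView viewWorldPose: simd_float4x4) -> simd_float4x4 {
    return viewWorldPose.inverse * anchorWorldPose
}

/// Recovers the anchor's world pose from the shared relative pose and a local view pose.
func worldPose(fromRelative relative: simd_float4x4,
               view viewWorldPose: simd_float4x4) -> simd_float4x4 {
    return viewWorldPose * relative
}

/// User A places an anchor; user B recovers it in a different world coordinate system.
func sharedAnchorPositionForUserB() -> SIMD3<Float> {
    // In A's world the view sits at (1, 0, 0) and the anchor half a meter in front of it.
    let viewInA = makeTranslation(1, 0, 0)
    let anchorInA = makeTranslation(1, 0, -0.5)
    let relative = relativePose(ofAnchor: anchorInA, toView: viewInA)

    // In B's world the same physical view happens to sit at (4, 0, 2).
    let viewInB = makeTranslation(4, 0, 2)
    let anchorInB = worldPose(fromRelative: relative, view: viewInB)
    return SIMD3<Float>(anchorInB.columns.3.x, anchorInB.columns.3.y, anchorInB.columns.3.z)
}
```

The two users end up with different world coordinates for the anchor, but both coordinates describe the same physical spot half a meter in front of the shared view.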
So, once again, that is why it is important to use ARAnchors in collaborative session.
With this knowledge in mind, let's look at our best practices for using ARAnchor.
First, always respond to ARAnchor updates. As ARKit explores the map more and more, it will optimize the 3D landmark positions by fine-tuning the camera pose locations. When that happens, your ARAnchor position will change as well because it is attached to the view.
So, you need to react to those anchor update callbacks so you can change your virtual object position accordingly.
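A sketch of reacting to anchor updates when rendering without RealityKit: each object stores a small offset from its anchor, and its world transform is recomputed whenever the anchor moves. `AnchoredObject` and `AnchoredObjectStore` are hypothetical types introduced for this example.

```swift
import ARKit
import simd

/// Hypothetical record of a virtual object placed relative to an anchor.
struct AnchoredObject {
    var localOffset: simd_float4x4     // small offset from the anchor
    var worldTransform: simd_float4x4  // where to draw the object right now
}

/// Hypothetical store that keeps object transforms in step with anchor updates.
final class AnchoredObjectStore: NSObject, ARSessionDelegate {
    private(set) var objects: [UUID: AnchoredObject] = [:]

    /// Registers an object on an anchor, with an optional offset from it.
    func place(on anchor: ARAnchor, offset: simd_float4x4 = matrix_identity_float4x4) {
        objects[anchor.identifier] = AnchoredObject(localOffset: offset,
                                                    worldTransform: anchor.transform * offset)
    }

    // ARKit refined the map, so anchors moved: recompute each object's world transform.
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for anchor in anchors {
            guard var object = objects[anchor.identifier] else { continue }
            object.worldTransform = anchor.transform * object.localOffset
            objects[anchor.identifier] = object
        }
    }

    func session(_ session: ARSession, didRemove anchors: [ARAnchor]) {
        for anchor in anchors {
            objects[anchor.identifier] = nil
        }
    }
}
```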
Second, when you place your virtual object, it is best to place it near the ARAnchor, not far away from it. The reasoning is the same as before. When the anchor update happens, if you have a virtual object far away from the ARAnchor, then you could experience a large spatial update to your virtual object, which is not desirable. So, it is best to place your virtual object near the ARAnchor, so you can represent the tracking quality correctly. Last, if you have multiple independent virtual objects, then it is recommended to use multiple ARAnchors so they will attach to different parts of the map. That makes sure the distance from each virtual object to its corresponding ARAnchor is small.
However, if you have a scenario with multiple virtual objects whose relative distances you want to maintain, then it is legitimate to use one single ARAnchor to represent them all as long as they are not far away from the anchor. So, that summarizes our best practices for using ARAnchors.
By following those best practices, you can utilize the best tracking quality that ARKit provides to your app.
Next, we're going to move to David to talk about SwiftStrike.
Thanks. Well done.
Hi, everyone. I'm David and I'm here to talk to you about SwiftStrike which is the new multiplayer AR experience that we developed for the show here at WWDC 2019.
We were inspired by the work we did last year with SwiftShot and we wanted to build something new that leveraged RealityKit and ARKit 3 to deliver an all-new experience. We have a Tabletop version that's available as sample code now. And we're working on releasing the full version as sample code in the future.
If you want to, you can also go look at last year's session about SwiftShot. I'm going to talk a little bit about a couple of things we did in SwiftShot and compare and contrast what we're doing this year. So, you may want to take a look at that. Now, there's a lot that goes into building a game like SwiftStrike. There's sound design, asset design, animations, all kinds of things. I'm really going to focus on three areas here.
How we use RealityKit networking to get the shared experience up and running. The physics simulation to make sure that the game played and was fun. And also, a little bit about how we designed the game around the new capabilities of RealityKit and ARKit 3. So, first, RealityKit networking. RealityKit networking is based on the entity-component architecture that's built into RealityKit. As you write and change components, they're automatically synchronized across the network for you including all the physics state. You don't have to do any of that code yourself.
You can also define custom components for your own application or game logic. And it will take care of the synchronization for you, as well.
It uses MultipeerConnectivity as the network layer. This is built into all iOS and macOS devices. It's easy to set up and get going.
All you have to do is create that network session, hand it to the ARView object, and it takes care of the rest. That includes moving the collaborative mapping data that Kuen-han talked about with the new ARKit 3 collaborative mapping. Now, MultipeerConnectivity and RealityKit are both hostless, true peer-to-peer systems. But in SwiftStrike, we discovered that for our game to really get things working we needed to define one device as the host. It's the one that keeps track of what the state of the game is and how the physics is working. The other devices participate and provide input and also receive the information from the host about where the game is running.
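Handing the network session to the ARView might look like the following sketch. The function name is hypothetical, and peer discovery (advertising and browsing) is assumed to happen elsewhere. `MultipeerConnectivityService` and `synchronizationService` are the RealityKit pieces that connect the scene to an `MCSession`.

```swift
import ARKit
import MultipeerConnectivity
import RealityKit

/// Connects the view's scene to the given MultipeerConnectivity session and
/// starts a collaborative AR session. (Hypothetical helper.)
func startSharedScene(in arView: ARView, over mcSession: MCSession) throws {
    // RealityKit synchronizes entities and components through this service.
    arView.scene.synchronizationService = try MultipeerConnectivityService(session: mcSession)

    // Collaboration still has to be enabled on the ARKit configuration.
    let configuration = ARWorldTrackingConfiguration()
    configuration.isCollaborationEnabled = true
    arView.session.run(configuration)
}
```

A caller would typically create the `MCSession` with its own `MCPeerID`, connect peers through an advertiser and browser, and then pass the session to this function.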
So, again, about custom components in RealityKit. They're really easy to set up. You define a struct. You register the components before you instantiate the ARView. And you comply with the Swift Codable protocol. That provides all the information RealityKit needs to serialize your structure and send it across the network. So, here's one way we use that in the game.
We've discovered through playtesting in SwiftStrike, it was really important to make sure that both players were positioned in their starting spot when the game starts. Otherwise, it was possible for one player to get an advantage, be closer to the ball, and kind of get around the other user.
So, we have an object we call the Match object. It keeps track of whether or not each player is in the starting space and then decides when to launch the ball. That state is also synchronized over to the clients so that we can present instructions to them using UIKit as to where they need to stand.
The component also maintains a log of all those states. There's not many that it goes through, and it helps ensure that every client will see every state as it occurs. So, here's an example of that at work.
We wait until both players have gotten into position before we launch the ball.
Once they have, the ball launches and the game begins. Here's the code that we used to do that. First, is the component we define. The MatchStateComponent. It conforms to both the RealityKit component protocol and the Swift Codable protocol. We define a transition within it. And there's an array of transitions. So, each device gets a full log of all the Match states as they go forward. You can respond appropriately.
Before we get started, we register our component with RealityKit so that it is ready to start synchronizing it.
That's all we need to do. Now, changes to the component on the host are automatically synchronized over to the clients. On the client, we then use that. We create a MatchObserver object that watches that component for changes and then broadcasts them out to all interested parties.
We're using the Combine framework for this. It's a great alternative to using delegation and really gives you a lot of flexibility. I'd recommend looking in on some of the Combine sessions from this year.
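The code shown on stage is not part of this transcript, so the following is only a sketch of what such a component and observer could look like. The type names `MatchStateComponent` and `MatchObserver` come from the talk; the `MatchState` cases, the `Transition` fields, and every method name are assumptions. How the observer notices a change (for example, by checking the component on each scene update) is outside this sketch.

```swift
import Combine
import Foundation
import RealityKit

/// Hypothetical set of match states.
enum MatchState: String, Codable {
    case waitingForPlayers, playersInPosition, ballLaunched
}

/// Custom component carrying a log of state transitions, so no state is missed.
struct MatchStateComponent: Component, Codable {
    struct Transition: Codable, Equatable {
        var state: MatchState
        var sequence: Int
    }

    var transitions: [Transition] = []

    mutating func enter(_ state: MatchState) {
        transitions.append(Transition(state: state, sequence: transitions.count))
    }
}

/// Call once, before creating the ARView, so RealityKit can synchronize the component.
func registerGameComponents() {
    MatchStateComponent.registerComponent()
}

/// Publishes every transition exactly once, even if several arrive in one update.
final class MatchObserver {
    let transitions = PassthroughSubject<MatchStateComponent.Transition, Never>()
    private var deliveredCount = 0

    func componentDidChange(_ component: MatchStateComponent) {
        let pending = component.transitions.dropFirst(deliveredCount)
        deliveredCount += pending.count
        for transition in pending {
            transitions.send(transition)
        }
    }
}
```

Keeping the full log in the component, rather than only the latest state, is what lets a client that receives two changes in one update still see both.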
So, when we were doing SwiftStrike, we kind of started by bringing over a lot of the code from SwiftShot. And if you watched the session from last year, we spent a lot of time talking about how we synchronized the physics data, how we encoded it, how we really compressed it and made it tight to limit our network usage.
So, here's a list of most of the classes or types that we had to implement for that. Well, RealityKit does the physics sync for us. And using custom components, it can also synchronize game state for us.
So, we deleted all of that. Didn't need it, anymore. And then, we took a look at the messages that were left, which were really only about deciding whether to use collaborative mapping or world map sharing to get the game started. That only gets sent once.
So, they don't need to be tightly encoded.
So, sad to say, I deleted the BitStream code. That's about 1500 lines of code that we were able to get rid of and that's lines of code that you won't have to write, anymore, thanks to RealityKit networking, to get your shared AR experience up and running. Next, let's talk about the physics simulation itself. That synchronization, as I said, is handled by RealityKit using its built-in physics engine.
On any of these, you can configure the physics properties by setting up components. We set up the rigid body. That defines the shape of the object in the scene. You define collision masks that configure which objects in your scene can collide with which other objects. And then, also, the additional physical properties of the object: the mass, the friction, the restitution. All of those play into getting a great experience. In SwiftStrike, the host device owns the simulation. But the client devices provide information about where each individual player is to make the game happen. And I'll talk about how that works, later on. Now, in SwiftStrike, most of the objects are pretty simple. The ball is a sphere, and the play surface is a plane. We put walls on the sides to make sure the ball doesn't fly out.
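Configuring those components for a simple object such as the ball could look roughly like this. The function name and all numeric values are placeholders, not SwiftStrike's actual tuning.

```swift
import RealityKit

/// Builds a ball entity with a collision shape and a dynamic rigid body.
/// Radius, mass, friction, and restitution are placeholder values.
func makeBall(radius: Float = 0.25) -> ModelEntity {
    let shape = ShapeResource.generateSphere(radius: radius)
    let ball = ModelEntity(mesh: .generateSphere(radius: radius),
                           materials: [SimpleMaterial()])

    // The collision component defines the shape used by the physics simulation.
    ball.collision = CollisionComponent(shapes: [shape])

    // The physics body carries mass, surface material, and simulation mode.
    ball.physicsBody = PhysicsBodyComponent(
        massProperties: PhysicsMassProperties(shape: shape, mass: 1.0),
        material: .generate(friction: 0.4, restitution: 0.7),
        mode: .dynamic)
    return ball
}
```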
But there's one object that we really needed to get right, that's a little bit more complicated than that. And that's the bowling pin.
We really needed this to bounce true and sound right for the game to be compelling.
This is just the wireframe of the 3D model our technical artist provided for us.
And this is then updated to make it really look great when it renders. But really, it's far too much data for the physics simulation.
We wanted to take this and make it a lot simpler while still maintaining a great bowling pin kind of feel. So, here's what we did with that. We used a combination of the primitive shapes that are part of RealityKit, with spheres at the top and in the middle. And then, we also built convex hulls around the pin to give it a base to stand on and a neck to bounce off of other things. You know, when you're doing a physics simulation, you want to be careful to use primitives whenever you can. If you can't, make sure that your convex hulls are relatively simple. This will give you the best performance.
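Combining primitives with a simple convex hull can be sketched as below. The dimensions and hull points are invented for illustration and do not come from the SwiftStrike assets. The sketch builds only one hull, for the base.

```swift
import RealityKit

/// Builds a simplified set of collision shapes for a bowling pin.
/// All dimensions are made-up placeholders, in meters.
func makePinCollisionShapes() -> [ShapeResource] {
    // Primitive spheres approximate the head and the widest part of the body.
    let head = ShapeResource.generateSphere(radius: 0.03)
        .offsetBy(translation: [0, 0.33, 0])
    let belly = ShapeResource.generateSphere(radius: 0.06)
        .offsetBy(translation: [0, 0.12, 0])

    // A small convex hull gives the pin a flat base to stand on.
    let basePoints: [SIMD3<Float>] = [
        [-0.03, 0.00, -0.03], [0.03, 0.00, -0.03], [0.03, 0.00, 0.03], [-0.03, 0.00, 0.03],
        [-0.05, 0.08, -0.05], [0.05, 0.08, -0.05], [0.05, 0.08, 0.05], [-0.05, 0.08, 0.05]
    ]
    let base = ShapeResource.generateConvexHull(from: basePoints)

    return [head, belly, base]
}
```

The resulting array can be passed to `CollisionComponent(shapes:)` on the pin entity, while the detailed mesh is used only for rendering.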
So, we spent a lot of time tuning this to get the right combination. So here's what that looked like all together in the data.
But of course, on the court, you just see the great looking pin itself. RealityKit's physically based rendering really gives a good shine on it and makes it look great. Thank you.
So, last thing, let's talk a little bit about the game design. And that has three areas. You know, designing for People Occlusion, building an on-site experience, and defining a control mechanism for the game itself.
When we learned about ARKit 3's Person Occlusion, we knew right away that we wanted this year's game to be a full-size experience. And we designed it so that you see Person Occlusion happening right from the start. When you're in starting position you see the ball in front of you, you see the other player, and you see the pins behind the other player. Right away, Person Occlusion is a big part of the game.
Previously, building an AR experience, you had to kind of make sure that you didn't get a person between the camera and the content. So SwiftShot pretty much had to be a Tabletop game last year.
With SwiftStrike and Person Occlusion, now you've got a lot more possibilities as to how you want to include the virtual content in your game. Now, a full-size game requires a full-size space to play it in. So, we worked with the facilities team and had a custom floor installed here at the Convention Center for people to play on.
The wood flooring not only evokes a bowling alley but also provides lots of great feature points for the ARKit tracking. So, you get a nice stable display.
We also used the image of the logo in the center of the court to position the gameboard properly. ARKit image anchors are used to find that location and put the board there. So, every time it starts, the game is correctly positioned and people are ready to go.
Now, for the AR localization, we're using a combination of ARKit world maps and collaborative data.
The players start with a world map on their device that they load and get localized very quickly. And then, they start sharing collaborative data after that. So, they get up fast with the Quick Start and then, maintain that over time as the devices share the data about the world around them. Finally, let's talk about the control mechanism.
With SwiftShot last year, we thought we had a pretty simple control mechanism. Right. Just tap to grab, pull to release.
We made it even simpler this year. You don't have to touch the screen; you just move it to push the ball.
We discovered through game playtesting that it was great if a faster movement meant a bigger push on the ball. Give it a kick and make it really bounce past the other player. The other thing we discovered though in our playtesting was every once in a while, the ball would roll right through you. And that wasn't great.
So, instead, we added an invisible physics body located where the player is. And then, we discovered that we could just win the game by running around and knocking over all the other players' pins.
So, instead, we're using collision masks to filter that out.
The ball will collide with the pins and with a person, but the pins and the person won't collide with each other. Those were some of the ways that we used the networking system and the physics to really get a great experience.
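The filtering described here can be expressed with RealityKit collision groups and masks. The group names and bit assignments below are hypothetical, and letting pins collide with other pins is an assumption the talk does not state.

```swift
import RealityKit

/// Hypothetical collision groups and filters for the ball, the pins, and the player's body.
enum GameCollision {
    static let ball = CollisionGroup(rawValue: 1 << 0)
    static let pin = CollisionGroup(rawValue: 1 << 1)
    static let player = CollisionGroup(rawValue: 1 << 2)

    // The ball hits both pins and players.
    static let ballFilter = CollisionFilter(group: ball, mask: [pin, player])
    // Pins react to the ball and (by assumption) to each other, but never to players.
    static let pinFilter = CollisionFilter(group: pin, mask: [ball, pin])
    // The invisible player body only interacts with the ball.
    static let playerFilter = CollisionFilter(group: player, mask: [ball])
}

/// Applies a filter to an entity that already has a collision component.
func applyFilter(_ filter: CollisionFilter, to entity: Entity & HasCollision) {
    entity.collision?.filter = filter
}
```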
Now, one of the things that we needed to solve then is how do we get the input from the client device moving around onto the host, while the host maintains control over when the paddle is active and how much force it's applying. And so, we solved this using the ownership support within RealityKit. When the host starts the AR session, it creates an AnchorEntity, since all content within RealityKit is parented to an AnchorEntity, and the host maintains ownership over it.
When the Client joins, it adds another entity to the scene that we call the PlayerLocationEntity, using the subclassing support within RealityKit.
Ownership of this entity stays with the Client, so the Client can update its location with every frame. And that's parented to the AnchorEntity so it appears on all the devices. As a child of that, we add the PaddleEntity. And it's parented to the PlayerLocationEntity. So, as the player moves around, the PlayerLocationEntity location gets updated. And that moves the PaddleEntity, while the host maintains control over what actions the PaddleEntity takes. It can turn it on and off and make sure that the gameplay remains fun for everybody.
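The entity hierarchy described here can be sketched as follows. The class names come from the talk, but their contents and the helper functions are assumptions. The sketch relies on RealityKit's `isOwner` property so that only the owning device writes the player's transform.

```swift
import RealityKit

/// Entity owned by the client; it follows the client's device. (Contents are hypothetical.)
final class PlayerLocationEntity: Entity {}

/// Entity owned by the host; it decides when the paddle is active. (Contents are hypothetical.)
final class PaddleEntity: Entity {}

/// Client side: add the player's location entity under the shared board anchor.
func addPlayerLocation(to boardAnchor: AnchorEntity) -> PlayerLocationEntity {
    let player = PlayerLocationEntity()
    boardAnchor.addChild(player)
    return player
}

/// Client side: move the entity to the camera pose each frame, but only if this device owns it.
func updatePlayerLocation(_ player: PlayerLocationEntity, cameraTransform: Transform) {
    guard player.isOwner else { return }
    player.setTransformMatrix(cameraTransform.matrix, relativeTo: nil)
}

/// Host side: parent a paddle to the client's player entity so it moves with the player.
func attachPaddle(to player: PlayerLocationEntity) -> PaddleEntity {
    let paddle = PaddleEntity()
    player.addChild(paddle)
    return paddle
}
```

On the client, `updatePlayerLocation` could be called once per frame with `arView.cameraTransform`.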
So, let's look at how all that came together with ARKit 3 and RealityKit to make a great gameplay experience.
Here, again, is part of the video from the State of the Union on Monday showing everyone playing the game. And Adam is, once again, the winner. Now, when we were building this, we started to learn about the other things that were coming out this year. And one of those was Dark Mode in iOS. And we decided we needed to take that a step further. And so, we implemented Cosmic Mode in SwiftStrike. We swapped out the assets, darkened the video feed, and used some cards with a billboarding effect to really give a glow effect. Let's take a look at that.
Here we go. Took me a few tries to get the winner on the first try.
So, that's SwiftStrike.
So, in summary of what we talked about today, Kuen-han covered the new collaborative session sharing feature in ARKit 3 and how that enables much easier localization and new shared experiences. We talked about the best ways to use ARAnchors to position content within your AR experience. And then, we talked about SwiftStrike, our new game for 2019.
We've done a Tabletop version using Reality Composer and the source for that is available now. You can get more information about that by looking at the ''Building AR Experiences with Reality Composer'' session. And we're planning to release the source for the full version of SwiftStrike with the features seen.
For more information, you can look at our URL for this session. Kuen-han and I will both be at the ARKit and RealityKit labs, immediately after the session at 3:00. And also, for those of you who have gotten really good at SwiftStrike, we're having a tournament on Friday at 12:30. So, we hope you all come and participate and see that. Thank you.
[ Applause ]
