Augmented Reality

ARKit, Apple's augmented reality (AR) technology, delivers immersive, engaging experiences that seamlessly blend virtual objects with the real world. In AR apps, the device's camera presents a live, onscreen view of the physical world. Three-dimensional virtual objects are superimposed over this view, creating the illusion that they actually exist. The user can reorient their device to explore the objects from different angles and, if appropriate for the experience, interact with objects using gestures and movement.

Designing an Engaging Experience

Use the entire display. Devote as much of the screen as possible to viewing and exploring the physical world and your app's virtual objects. Avoid cluttering the screen with controls and information that diminish the immersive experience.

Create convincing illusions when placing realistic objects. Include objects that appear to inhabit the physical environment in which they're placed. For best results, design detailed 3D assets with lifelike textures. Use the information ARKit provides to position objects on detected real-world surfaces, scale objects properly, reflect environmental lighting conditions on virtual objects, cast top-down diffuse object shadows on real-world surfaces, and update visuals as the camera's position changes. Make sure your app updates the scene 60 times per second so objects don’t appear to jump or flicker.
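Plane detection and light estimation are the ARKit features behind most of these cues. Below is a minimal setup sketch, assuming a SceneKit-based app with an ARSCNView named sceneView (the function name is illustrative); ARSCNView also renders at the display's refresh rate by default, so per-frame updates come for free.

```swift
import ARKit
import SceneKit

// A minimal session setup, assuming a SceneKit-based app with an
// ARSCNView named `sceneView`.
func startSession(on sceneView: ARSCNView) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal, .vertical] // place objects on real surfaces
    configuration.isLightEstimationEnabled = true           // reflect ambient lighting

    // Let SceneKit apply ARKit's per-frame light estimate to the scene.
    sceneView.automaticallyUpdatesLighting = true

    sceneView.session.run(configuration)
}
```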

Consider how virtual objects with reflective surfaces show the environment. Reflections in ARKit are approximations based on the environment captured by the camera. Small or coarse reflective surfaces make it easier to maintain the illusion.

Anticipate that people will use your app in environments that aren’t optimal for AR. People may open your app in a location where there isn't much room to move around or there aren't large, flat surface areas. Try to anticipate scenarios that present challenges, and clearly communicate requirements or expectations to people up front. Consider offering varying sets of features for use in different environments.

Be mindful of the user's comfort. Holding a device at a certain distance or angle for a prolonged period can be fatiguing. Consider how people must hold their device when using your app, and strive for an enjoyable experience that doesn't cause discomfort. For example, by default, you could place objects at a distance that reduces the need to move the device closer to the object. A game could keep levels short and intermixed with brief periods of downtime.

Favor indirect controls that enable one-handed use of your app. Indirect controls are not part of the virtual environment. Instead, they're displayed in “screen space,” appearing attached to the screen at a fixed location. Controls in screen space are easier to target and less likely to require users to adjust how they're holding their device. Make controls large enough to target easily and accurately with one finger, and use translucency so they occlude as little of the underlying scene as possible. Examples include the controls for adding a point or taking a photo in the Measure app.

If your app encourages user motion, introduce it gradually. In a game, the user shouldn't need to move out of the way to avoid a virtual projectile as soon as they enter AR. Give them time to adapt to the experience first. Then, progressively encourage movement.

Be mindful of the user's safety. Rapid, sweeping, or expansive motion can be dangerous in AR. Keep in mind the user's environment may include people, objects such as walls and furniture, and changes in the stability or height of the surface where they stand. Consider ways of making your app safe to operate; for example, a game could avoid encouraging large or sudden movements.

Use audio and haptic feedback to enhance the immersive experience. A sound effect or bump sensation is a great way to confirm that a virtual object has made contact with a physical surface or other virtual object. In an immersive game, background music can help envelop the user in the virtual world. For related guidance, see Audio and Haptic Feedback.

Diagram of a cube. The base of the cube is indicated with a grid, and the active side of the cube is outlined in blue. Arrows follow a continuous circle around the cube to the right, indicating the direction of motion. A green check mark below the diagram indicates this is the correct process.

Diagram of a cube. The base of the cube is indicated with a grid, and underneath the cube is the text "Rotate" shown in white inside a black oval. A red X below the diagram indicates this is not the correct process.

Wherever possible, provide hints in context. Placing a three-dimensional rotation indicator around an object, for example, is more intuitive than presenting text-based instructions in an overlay. Textual overlay hints may be warranted prior to surface detection, however, or if the user isn't responding to contextual hints.

Consider guiding people toward offscreen virtual objects. It can sometimes be hard to locate an object that’s positioned offscreen. If it seems like the user is having trouble finding an offscreen object, consider offering visual or audible cues. For example, if an object is offscreen to the left, you could show an indicator along the left edge of the screen so the user knows to aim the camera in that direction.
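One way to implement such a cue is to project the object's position into screen space and compare it against the view's bounds. A simplified sketch, assuming an ARSCNView named sceneView; note that points behind the camera can project to mirrored coordinates, so a production version needs more care.

```swift
import ARKit
import SceneKit

// A simplified sketch for hinting at an offscreen object, assuming an
// ARSCNView named `sceneView`. Returns nil when the object is visible,
// or a point clamped to the screen edge for positioning an indicator.
func edgeHintPosition(for objectNode: SCNNode, in sceneView: ARSCNView) -> CGPoint? {
    let projected = sceneView.projectPoint(objectNode.worldPosition)
    let point = CGPoint(x: CGFloat(projected.x), y: CGFloat(projected.y))

    // Projected depth inside 0...1 means the object is in front of the camera.
    let inFront = projected.z > 0 && projected.z < 1
    if inFront && sceneView.bounds.contains(point) {
        return nil // onscreen; no hint needed
    }

    // Clamp toward the nearest screen edge for the indicator position.
    let x = min(max(point.x, 0), sceneView.bounds.maxX)
    let y = min(max(point.y, 0), sceneView.bounds.maxY)
    return CGPoint(x: x, y: y)
}
```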

If you must display instructional text, use approachable terminology. AR is an advanced concept that may be intimidating to some users. To help make it approachable, avoid referring to technical, developer-oriented terms like ARKit, world detection, and tracking. Instead, use friendly, conversational terms that most people will understand.

Do: "Unable to find a surface. Try moving to the side or repositioning your phone."
Don't: "Unable to find a plane. Adjust tracking."

Do: "Tap a location to place the [name of object to be placed]."
Don't: "Tap a plane to anchor an object."

Do: "Try turning on more lights and moving around."
Don't: "Insufficient features."

Do: "Try moving your phone slower."
Don't: "Excessive motion detected."

Make important text readable. Display text used for labels, annotations, and instructions as if it is attached to the phone screen rather than in the virtual space. The text should face the user and be shown at the same size regardless of the distance of the labeled object.
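One way to achieve this is to keep the label in the view hierarchy rather than in the 3D scene, and reposition it each frame. A sketch, assuming a class conforming to SCNSceneRendererDelegate with an ARSCNView named sceneView, a UILabel named annotationLabel layered over it, and a labeled object at labeledNode (all three are assumptions):

```swift
import ARKit
import SceneKit

// A sketch of a screen-space annotation. SceneKit calls this delegate
// method once per frame.
func renderer(_ renderer: SCNSceneRenderer, updateAtTime time: TimeInterval) {
    let projected = sceneView.projectPoint(labeledNode.worldPosition)
    let visible = projected.z > 0 && projected.z < 1 // in front of the camera

    // The render callback arrives on a background thread; UIKit work
    // belongs on the main thread.
    DispatchQueue.main.async {
        self.annotationLabel.isHidden = !visible
        // Because the label is an ordinary view, it always faces the user and
        // keeps its point size no matter how far away the labeled object is.
        self.annotationLabel.center = CGPoint(x: CGFloat(projected.x),
                                              y: CGFloat(projected.y))
    }
}
```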

An iPhone screen in landscape showing the corner of a room viewed through the camera. In the room are two AR objects: a desk and a chair. Each object has a label in white text enclosed in a black oval. The oval is attached to the object by a vertical line. The label in each object ends with two spaces and a greater-than sign to indicate the label can be tapped for more information.

Indicate if more information is available for important text. Use a visual indicator that fits with your app experience to show users that they can tap for more information. An example is the disclosure indicator next to the value for a measurement in the Measure app.

Keep textual information in the environment to a minimum. Display only the information that the person needs for your app experience; the Measure app, for instance, displays only the value of the measurement and the unit type.

An iPhone screen in landscape showing a full screen view with the detailed information for a chair. On the left side of the screen is an image of the chair, in the middle is a vertical separator line, and on the right is the model number, price, and size of the chair.

Consider displaying additional information in screen space. Text displayed in screen space stays readable across a wide range of device positions and orientations, letting the user move freely without losing readability. Screen space can also present additional functionality. For example, the Measure app shows more detailed measurement information, along with a button for copying the measurement, in a popover view.

An iPhone screen showing the corner of a room viewed through the camera. On the screen is a translucent overlay containing the surface detection indicator. The indicator is a white square with rounded corners projected into 3D space. A small iPhone is shown scanning back and forth along the base of the square. A circle of dots trailing the iPhone is used to emphasize the movement.

Use a visual indicator to show people how to help with initialization and surface detection.

Indicate when initialization and surface detection are in progress, and involve the user. Each time your app enters AR, an initialization process occurs during which your app evaluates the surroundings and detects surfaces. Surface detection time can vary based on a number of factors. If initialization takes more than a few seconds, use a visual indicator that fits your app's experience to show that your app is attempting to detect a surface, and encourage people to speed up the process by slowly scanning their surroundings.
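On iOS 13 and later, ARKit's built-in ARCoachingOverlayView provides this guidance with a system-standard look and activates automatically while tracking is limited. A minimal sketch, assuming an ARSession named session and a container view named view:

```swift
import ARKit
import UIKit

// A minimal sketch using the system coaching overlay (iOS 13 and later).
func addCoachingOverlay(to view: UIView, session: ARSession) {
    let coachingOverlay = ARCoachingOverlayView()
    coachingOverlay.session = session
    coachingOverlay.goal = .horizontalPlane        // what to coach people toward
    coachingOverlay.activatesAutomatically = true  // appears whenever tracking is limited
    coachingOverlay.frame = view.bounds
    coachingOverlay.autoresizingMask = [.flexibleWidth, .flexibleHeight]
    view.addSubview(coachingOverlay)
}
```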

Designing Multiuser AR Apps

You can share your AR experience with multiple people at the same location or at different locations. The multiuser experience can be based on a shared surface, a shared physical object such as a toy, or a shared image. For example, SwiftShot lets two players in the same location knock down virtual blocks on a shared surface.

In a multiuser app where a device can act as both host and client, consider showing options for hosting and joining the experience on the same screen. Wait until the world is mapped and the virtual world is rendered before making the world available to participants.
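Under the hood, a same-location shared experience typically means sending the host's ARWorldMap to each client. A sketch under that assumption: getCurrentWorldMap succeeds only once the world is sufficiently mapped, and sendToPeers(_:) is a hypothetical helper built on MultipeerConnectivity (not shown here).

```swift
import ARKit
import MultipeerConnectivity

// Host side: capture and share the current world map.
func shareWorldMap(from session: ARSession) {
    session.getCurrentWorldMap { worldMap, error in
        guard let map = worldMap else { return } // not yet mapped; try again later
        if let data = try? NSKeyedArchiver.archivedData(withRootObject: map,
                                                        requiringSecureCoding: true) {
            sendToPeers(data) // hypothetical MultipeerConnectivity helper
        }
    }
}

// Client side: restore the received map before running the session.
func joinSharedSession(with data: Data, on session: ARSession) {
    guard let map = try? NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self,
                                                            from: data) else { return }
    let configuration = ARWorldTrackingConfiguration()
    configuration.initialWorldMap = map
    session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
```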

On top an image showing a corner of a room. In the foreground are two iPhone devices. The AR host iPhone on the right shows a camera view of the corner of the room that contains a virtual desk. The client iPhone on the left shows the surface detection screen. A dotted line with an arrow indicates moving the client iPhone next to the host iPhone.

On the bottom left an image showing the client and host devices side by side pointed towards the corner of the room. The client iPhone still shows the surface detection screen.

On the bottom right an image showing the client and host devices side by side with both screens showing the corner of the room and the virtual desk.

Joining a multiuser experience by viewing the surface from the same direction as the host.

When joining a multiuser experience using the same surface or physical object, encourage clients to stand near the host and point their devices in a similar direction. Shared surfaces and physical objects are detected faster from a shared vantage point. Consider displaying an instructional graphic showing multiple devices aimed at a common virtual object.

Placing Virtual Objects

Diagram of a reticle consisting of four right-angled shapes framing a square surface. The square is shown in 3D perspective with the longest edge at the bottom.

Surface detection indicator

Diagram of a square-shaped reticle shown in 3D perspective with the longest edge at the bottom.

Object placement indicator

Diagram of a complex reticle consisting of a circular shape with markers indicating a center and diameter, inside right-angle shapes framing a square. The reticle is shown in 3D perspective with the longest edge on the bottom.

App-specific indicator

Help people understand when to locate a surface and place an object. Use a visual indicator to communicate that surface targeting mode is active. A trapezoidal reticle in the center of the screen, for example, helps people infer that they should find a horizontal or vertical flat surface. Once a surface is targeted, the indicator should change in appearance to suggest that object placement is now possible. If the indicator's orientation follows the alignment of the detected surface, it can help people anticipate how the placed object will be aligned. Design visual indicators that feel like part of your app experience.

Respond appropriately when the user places an object. Accuracy is progressively refined (over a very short time) during surface detection. If the user taps the screen to place an object, place it immediately by using the information that's currently available. Then, when surface detection is complete, subtly refine the object's position. If an object is placed beyond the bounds of the detected surface, gently nudge the object back onto the surface.
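A sketch of immediate placement on tap, assuming an ARSCNView named sceneView and a hypothetical helper placeObject(at:) that adds the virtual object to the scene:

```swift
import ARKit

// Place an object where the user taps, using the best surface estimate
// currently available.
@objc func handleTap(_ gesture: UITapGestureRecognizer) {
    let point = gesture.location(in: sceneView)

    // Allowing estimated planes lets placement succeed before plane
    // detection finishes.
    guard let query = sceneView.raycastQuery(from: point,
                                             allowing: .estimatedPlane,
                                             alignment: .horizontal),
          let result = sceneView.session.raycast(query).first else { return }

    placeObject(at: result.worldTransform) // hypothetical helper
}
```

For the subsequent refinement, ARSession's trackedRaycast(_:updateHandler:) can deliver updated results as ARKit's understanding of the surface improves, letting you subtly nudge the object into its final position.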

Avoid trying to precisely align objects with the edges of detected surfaces. In AR, surface boundaries are approximations that may change as the user's surroundings are further analyzed.

User Interaction with Virtual Objects

Diagram showing a cube and a hand with the index finger touching the cube. There is a curved line intersecting the finger and cube to indicate the movement of the finger. A green check mark below the diagram indicates this is the correct control for rotating the cube.

Diagram showing a cube. Below the cube are two buttons, each with a circular arrow pointing in the opposite direction. A red X below the diagram indicates this is not the correct control for rotating a cube.

Favor direct manipulation over separate onscreen controls. It's more immersive and intuitive when a user can touch an object onscreen and interact with it directly, rather than interact with separate controls on a different part of the screen. Bear in mind, however, that direct manipulation can sometimes be confusing or difficult when the user is moving around.

Let people directly interact with virtual objects using standard, familiar gestures. For example, consider supporting a single-finger drag gesture for moving objects, and a two-finger rotation gesture for spinning objects. Rotation should generally occur relative to the surface on which an object rests—for example, an object placed on a horizontal surface would typically rotate around the object's vertical axis. For related guidance, see Gestures.
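A sketch of these two gestures, assuming an ARSCNView named sceneView and a currently selected node named selectedNode (both are assumptions):

```swift
import ARKit

// Drag: keep the object on its surface by raycasting under the finger.
@objc func handlePan(_ gesture: UIPanGestureRecognizer) {
    let point = gesture.location(in: sceneView)
    if let query = sceneView.raycastQuery(from: point,
                                          allowing: .existingPlaneInfinite,
                                          alignment: .horizontal),
       let result = sceneView.session.raycast(query).first {
        selectedNode.simdWorldPosition = simd_make_float3(result.worldTransform.columns.3)
    }
}

// Two-finger rotation: spin around the vertical axis of the resting surface.
@objc func handleRotation(_ gesture: UIRotationGestureRecognizer) {
    selectedNode.eulerAngles.y -= Float(gesture.rotation)
    gesture.rotation = 0 // consume the delta so rotation stays incremental
}
```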

In general, keep interactions simple. Touch gestures are inherently two-dimensional, but an AR experience involves the three dimensions of the real world. Consider the following approaches to simplifying user interactions with virtual objects.

Diagram showing a sphere. The base of the sphere is on a grid. Two lines that are parallel to the grid and to each other pass through the center of the sphere. There is an arrow at the tip of each line indicating movement direction.

Limit movement to the two-dimensional surface on which the object rests.

Diagram showing a sphere. A dotted line runs vertically through the center of the sphere. An arrow wraps around the outside of the sphere and the vertical line from left to right indicating movement of the sphere around that line.

Limit object rotation to a single axis.

Respond to gestures within reasonable proximity of interactive virtual objects. It may be difficult for the user to precisely touch specific points on objects that are small, thin, or placed at a distance. When your app detects a gesture near an interactive object, it's usually best to assume the user wants to affect the object.

Consider whether user-initiated object scaling is necessary. Employ scaling when an object, like a toy or game character, doesn't have an intrinsic size and the user might want to see it larger or smaller. For an object with a finite size relative to the real world, like a piece of furniture, scaling is irrelevant if the item is placed at an accurate size. Scaling isn’t a remedy for adjusting the distance of an object—making an object larger to make it appear closer, for example, just results in a larger object that's still far away.

Be wary of potentially conflicting gestures. A two-finger pinch gesture, for example, is quite similar to a two-finger rotation gesture. If you implement two similar gestures like this, be sure to test your app and make sure they're interpreted properly.

Make sure virtual object movements are smooth. Objects shouldn't appear to jump when the user resizes them, rotates them, or moves them to a new location.

Consider how virtual objects behave as they're moved. Movement of objects should be consistent with the physics of the environment created by your app. Check that selected objects move smoothly through the environment, and consider what to do if the height of the surface over which an object is moved changes.

Explore even more engaging methods of interaction. Gestures aren't the only way for people to interact with virtual objects in AR. Your app can use other factors, like motion and proximity, to bring content to life. A game character, for example, could turn its head to look at the user as the user walks toward it.

Reacting to Imagery in the User's Environment

You can enhance an AR experience by using known imagery in the user’s environment to trigger the appearance of virtual content. Your app provides a set of 2D reference images or 3D reference objects, and ARKit indicates when and where it detects any of those images or objects in the user’s environment. For example, an app might recognize theater posters for a sci-fi film and then have virtual spaceships emerge from the posters and fly around the environment. Or an art museum app could display a virtual tour guide when it recognizes a sculpture.

Design and display reference images to optimize detection. When you provide reference images, you specify the physical size at which you expect to find those images in the user’s environment. Providing a more precise size measurement helps ARKit detect images faster and provide more accurate estimates of their real-world position. Detection performance and accuracy are best for flat rectangular images with high contrast and bold details. Avoid trying to detect images that appear on reflective or curved real-world surfaces.
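A sketch of the setup, assuming an asset catalog resource group named "AR Resources" with a physical size specified for each image:

```swift
import ARKit

// Run world tracking with image detection enabled.
func runImageDetection(on session: ARSession) {
    guard let referenceImages = ARReferenceImage.referenceImages(
        inGroupNamed: "AR Resources", bundle: nil) else { return }

    let configuration = ARWorldTrackingConfiguration()
    configuration.detectionImages = referenceImages
    // Continuously track positions only for the images that need it;
    // see the guidance on tracked images below.
    configuration.maximumNumberOfTrackedImages = 1
    session.run(configuration)
}
```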

Consider briefly delaying the removal of virtual objects attached to a tracked image when that image first disappears. Image tracking can be interrupted momentarily even while the image is still in the frame. Consider waiting 0.5 to 1 second before fading out or removing the attached virtual objects to prevent them from appearing to flicker.

Limit the number of reference images in use at one time. Image detection performance works best when ARKit looks for 25 or fewer distinct images in the user’s environment. If your use case calls for more than 25 reference images, you can change the set of active reference images based on context. For example, a museum guide app could use Core Location to determine which part of the museum the user is currently in, and then look only for images displayed in that area.

Limit the number of reference images requiring an accurate position. Updating the position of a reference image requires more resources. Use a tracked image when the image may move in the environment or when an attached animation or virtual object is small compared to the size of the image.

For developer guidance, see Recognizing Images in an AR Experience.

Handling Interruptions

Avoid unnecessarily interrupting the AR experience. ARKit can't track device position and orientation when AR isn’t active. One way to avoid interruptions is to let people adjust objects and settings within the experience. For example, if a user places a chair they’re considering purchasing into their living room and that chair is available in different fabrics, allow them to change the fabric without exiting AR.

Use relocalization to recover from other interruptions. ARKit can't track device position and orientation during an interruption, such as the user temporarily switching to another app or accepting a phone call. After the interruption, previously placed virtual objects are likely to appear in the wrong real-world positions. When you enable relocalization, ARKit attempts to recover the information needed to restore those virtual objects to their original real-world positions. This process requires the user to position and orient their device near where it was before the interruption. For developer guidance, see ARSessionObserver.
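Relocalization is opted into through the ARSessionObserver methods of the session delegate. A sketch, assuming a view controller class named ViewController (the class name and comment bodies are illustrative):

```swift
import ARKit

extension ViewController: ARSessionDelegate {
    func sessionShouldAttemptRelocalization(_ session: ARSession) -> Bool {
        // Returning true asks ARKit to restore world tracking after an
        // interruption instead of starting over.
        return true
    }

    func sessionWasInterrupted(_ session: ARSession) {
        // Hide previously placed objects; their positions may be stale.
    }

    func sessionInterruptionEnded(_ session: ARSession) {
        // Relocalization is underway; show objects again once tracking
        // returns to normal.
    }
}
```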

Consider hiding previously placed virtual objects until relocalization completes. During relocalization, ARKit attempts to reconcile its previous state with new observations of the user environment. Until this process completes, the positions of virtual objects are likely incorrect.

Allow users to cancel relocalization. If the user is unable to position and orient their device near where it was before an interruption, relocalization continues indefinitely without success. Guide the user to resume their session successfully, or provide a Reset button or other way for the user to restart the AR experience in case relocalization does not succeed.

Indicate when the front-facing camera is unable to track the face for more than 0.5 to 1 second. Use a visual indicator to indicate that the camera is no longer able to track the person’s face. If text instructions are required, keep them to a minimum.

Handling Problems

Let people reset the experience if it doesn’t meet their expectations. Don't force people to wait for conditions to improve or struggle with object placement. Give them a way to start over again and see if they have better results.
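A sketch of a reset action, assuming an ARSCNView named sceneView:

```swift
import ARKit

// Discard accumulated tracking data and anchors and start fresh.
@objc func resetExperience() {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal]

    sceneView.session.run(configuration,
                          options: [.resetTracking, .removeExistingAnchors])
}
```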

Diagram showing a corner of a brightly lit office. A desk, chair, filing cabinet, and part of a window are visible. A green check mark below the diagram indicates that this is sufficient lighting.

Sufficient lighting

Diagram showing the same office as the previous image with a black background. The corner of the room, desk, chair, filing cabinet, and window are all drawn as white line art. A red X below the diagram indicates that this is insufficient lighting.

Insufficient lighting

Suggest possible fixes if problems occur. Analysis of the user's environment and surface detection can fail or take too long for a variety of reasons—insufficient light, an overly reflective surface, a surface without enough detail, or too much camera motion. If your app is notified of these problems, offer suggestions for resolving them.

Problem: Insufficient features detected
Suggestion: Try turning on more lights and moving around.

Problem: Excessive motion detected
Suggestion: Try moving your phone slower.

Problem: Surface detection takes too long
Suggestion: Try moving around, turning on more lights, and making sure your phone is pointed at a sufficiently textured surface.
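One way to surface these suggestions is to observe tracking-state changes. A sketch mapping tracking-state reasons to the messages above, assuming it's called from ARSessionObserver's session(_:cameraDidChangeTrackingState:):

```swift
import ARKit

// Map a limited tracking state to a user-facing suggestion, or return nil
// when tracking is normal or unavailable.
func suggestion(for camera: ARCamera) -> String? {
    guard case .limited(let reason) = camera.trackingState else { return nil }
    switch reason {
    case .insufficientFeatures:
        return "Try turning on more lights and moving around."
    case .excessiveMotion:
        return "Try moving your phone slower."
    case .initializing, .relocalizing:
        return "Try moving around and pointing your phone at a textured surface."
    @unknown default:
        return nil
    }
}
```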

Offer AR features only on capable devices. If your app's primary purpose is AR, make your app available only to devices that support ARKit. If your app offers AR as a secondary feature—like a furniture catalog that includes product photos and allows some products to be viewed in AR—avoid displaying an error if the user tries to enter AR on an unsupported device. If the device doesn't support ARKit, don't present optional AR features in the first place. For developer guidance, see the arkit key in the UIRequiredDeviceCapabilities section of Information Property List Key Reference, and the isSupported property of ARConfiguration.
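A sketch of the secondary-feature case, assuming a hypothetical viewInARButton that launches the AR experience; AR-only apps should instead require the arkit device capability so unsupported devices never see the app.

```swift
import ARKit

// Hide the optional AR entry point on devices that can't support it.
func configureARAvailability() {
    viewInARButton.isHidden = !ARWorldTrackingConfiguration.isSupported
}
```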

AR Glyph

Apps can display an AR glyph in controls that launch ARKit-based experiences. You can download this glyph in Resources.

The AR glyph.

A button containing the AR glyph and the text "View in AR".

Use the AR glyph as intended. The glyph should be used strictly for initiating an ARKit-based experience. Never alter the glyph (other than adjusting its size and color), use it for other purposes, or use it in conjunction with AR experiences not created using ARKit.

Maintain minimum clear space. The minimum amount of clear space required around an AR glyph is 10% of the glyph's height. Don’t let other elements infringe on this space or occlude the glyph in any way.

Diagram showing the AR glyph centered within a dotted square to indicate leaving space around the glyph.

AR Badges

Apps that include collections of products or other objects can use badging to identify specific items that can be viewed in AR using ARKit. For example, a department store app might use a badge to mark furniture that people can preview in their home before making a purchase.

Partial diagram of an iPhone view with the title "Home Decor". In the view are four gray squares each containing a picture of an item of furniture: a standing desk lamp, an articulating desk lamp, a filing cabinet, and a small chest of drawers. In the upper left corner of each square is the AR badge with the glyph and the text "AR".

Use the AR badges as intended and don’t alter them. You can download AR badges, available in collapsed and expanded form, in Resources. Use these images exclusively to identify products or other objects that can be viewed in AR using ARKit. Never alter the badges, change their color, use them for other purposes, or use them in conjunction with AR experiences not created with ARKit.

The AR badge with both the glyph and the text "AR".

AR badge

The glyph-only ARKit badge.

Glyph-only AR badge

Prefer the AR badge to the glyph-only badge. In general, use the glyph-only badge only when space is too constrained to accommodate the AR badge. Both badges work well at their default size.

Use badging only when your app contains a mixture of objects that can be viewed in AR and objects that cannot. If all objects in your app can be viewed in AR, then badging is redundant.

Keep badge placement consistent and clear. A badge looks best when displayed in one corner of an object's photo. Always place it in the same corner and make sure it's large enough to be seen clearly (but not so large that it occludes important detail in the photo).

Maintain minimum clear space. The minimum amount of clear space required around an AR badge is 10% of the badge's height. Other elements shouldn't infringe on this space or occlude the badge in any way.

Image of the badge with the AR glyph and text "AR" with a dotted square around the outside to indicate leaving space around the badge.

Image of the glyph-only AR badge with a dotted square around the outside to indicate leaving space around the badge.

Learn More

For developer guidance, see ARKit.