Detect faces in a front-camera AR experience, overlay virtual content, and animate facial expressions in real time.
- iOS 11.0+
- Xcode 10.0+
This sample app presents a simple interface allowing you to choose between five augmented reality (AR) visualizations on devices with a TrueDepth front-facing camera.
- An overlay of x/y/z axes indicating the ARKit coordinate system tracking the face (and, in iOS 12, the position and orientation of each eye).
- The face mesh provided by ARKit, showing automatic estimation of the real-world directional lighting environment, as well as a texture you can use to map 2D imagery onto the face.
- Virtual 3D content that appears to attach to (and interact with) the user's real face.
- Live camera video texture-mapped onto the ARKit face mesh, with which you can create effects that appear to distort the user's real face in 3D.
- A simple robot character whose facial expression animates to match that of the user, showing how to use ARKit's animation blend shape values to create experiences like the system Animoji app.
Use the tab bar to switch between these modes.
This sample code project requires:
- An iOS device with a front-facing TrueDepth camera:
  - iPhone X, iPhone XS, iPhone XS Max, or iPhone XR
  - iPad Pro (11-inch) or iPad Pro (12.9-inch, 3rd generation)
- iOS 11.0 or later
- Xcode 10.0 or later
ARKit is not available in iOS Simulator.
Start a Face-Tracking Session in a SceneKit View
Like other uses of ARKit, face tracking requires configuring and running a session (an ARSession object) and rendering the camera image together with virtual content in a view. This sample uses ARSCNView to display 3D content with SceneKit, but you can also use SpriteKit or build your own renderer using Metal (see ARSKView and Displaying an AR Experience with Metal).
Face tracking differs from other uses of ARKit in the class you use to configure the session. To enable face tracking, create an instance of ARFaceTrackingConfiguration, configure its properties, and pass it to the run(_:options:) method of the AR session associated with your view, as shown in the sketch below.
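Here's a minimal sketch of that setup; the sceneView parameter stands in for the sample's ARSCNView outlet:

```swift
import ARKit

func resetTracking(in sceneView: ARSCNView) {
    // Face tracking requires a device with a TrueDepth front-facing camera.
    guard ARFaceTrackingConfiguration.isSupported else { return }

    // Create and run a face-tracking configuration on the view's session.
    let configuration = ARFaceTrackingConfiguration()
    configuration.isLightEstimationEnabled = true
    sceneView.session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
```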
Before offering features that require a face-tracking AR session, check the isSupported property on the ARFaceTrackingConfiguration class to determine whether the current device supports ARKit face tracking.
Track the Position and Orientation of a Face
When face tracking is active, ARKit automatically adds ARFaceAnchor objects to the running AR session, containing information about the user's face, including its position and orientation. (ARKit detects and provides information about only one face at a time. If multiple faces are present in the camera image, ARKit chooses the largest or most clearly recognizable face.)
In a SceneKit-based AR experience, you can add 3D content corresponding to a face anchor in the renderer(_:didAdd:for:) delegate method. ARKit manages a SceneKit node for the anchor and updates that node's position and orientation on each frame, so any SceneKit content you add to that node automatically follows the position and orientation of the user's face.
This example uses a convenience extension on SCNReferenceNode to load content from an .scn file in the app bundle. The renderer(_:didAdd:for:) delegate method provides the node ARKit manages for the face anchor; adding the loaded content as a child of that node allows ARKit to automatically adjust the content's position and orientation to match the tracked face, as sketched below.
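A sketch of what that delegate method might look like; the "glasses" asset name is illustrative, not part of the sample:

```swift
func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    // Only respond to face anchors.
    guard anchor is ARFaceAnchor else { return }

    // Load a scene file from the app bundle and attach it to the node
    // that ARKit manages for the face anchor.
    guard let url = Bundle.main.url(forResource: "glasses", withExtension: "scn"),
          let contentNode = SCNReferenceNode(url: url) else { return }
    contentNode.load()
    node.addChildNode(contentNode)
}
```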
Use Face Geometry to Model the User’s Face
ARKit provides a coarse 3D mesh geometry matching the size, shape, topology, and current facial expression of the user's face. ARKit also provides the ARSCNFaceGeometry class, offering an easy way to visualize this mesh in SceneKit.
Your AR experience can use this mesh to place or draw content that appears to attach to the face. For example, by applying a semitransparent texture to this geometry you could paint virtual tattoos or makeup onto the user’s skin.
To create a SceneKit face geometry, initialize an ARSCNFaceGeometry object with the Metal device your SceneKit view uses for rendering, and assign that geometry to the SceneKit node tracking the face anchor.
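For illustration, here's one way that might look inside the didAdd delegate callback, assuming the view controller keeps a sceneView reference:

```swift
func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    // Create a face geometry backed by the view's Metal device.
    guard anchor is ARFaceAnchor,
          let device = sceneView.device,
          let faceGeometry = ARSCNFaceGeometry(device: device) else { return }

    // Give the mesh a simple, lit appearance.
    let material = faceGeometry.firstMaterial!
    material.diffuse.contents = UIColor.lightGray
    material.lightingModel = .physicallyBased

    // Attach the mesh to the node ARKit manages for the face anchor.
    node.geometry = faceGeometry
}
```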
ARKit updates its face mesh to conform to the shape of the user's face, even as the user blinks, talks, and makes various expressions. To make the displayed face model follow the user's expressions, retrieve the updated face mesh in the renderer(_:didUpdate:for:) delegate callback, then update the ARSCNFaceGeometry object in your scene to match by passing the new face mesh to its update(from:) method.
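A minimal sketch of that update:

```swift
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let faceAnchor = anchor as? ARFaceAnchor,
          let faceGeometry = node.geometry as? ARSCNFaceGeometry else { return }

    // Deform the rendered mesh to match the user's current expression.
    faceGeometry.update(from: faceAnchor.geometry)
}
```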
Place 3D Content on the User’s Face
Another use of the face mesh that ARKit provides is to create occlusion geometry in your scene. An occlusion geometry is a 3D model that doesn’t render any visible content (allowing the camera image to show through), but obstructs the camera’s view of other virtual content in the scene.
This technique creates the illusion that the real face interacts with virtual objects, even though the face is a 2D camera image and the virtual content is a rendered 3D object. For example, if you place an occlusion geometry and virtual glasses on the user’s face, the face can obscure the frame of the glasses.
To create an occlusion geometry for the face, start by creating an ARSCNFaceGeometry object as in the previous example. However, instead of configuring that object's SceneKit material with a visible appearance, set the material to render depth but not color:
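A sketch of that configuration, written as a helper you might call from the didAdd delegate callback:

```swift
func addOcclusionGeometry(to node: SCNNode, in sceneView: ARSCNView) {
    guard let device = sceneView.device,
          let occlusionGeometry = ARSCNFaceGeometry(device: device) else { return }

    // Write depth but not color: the camera image shows through, yet the
    // mesh still hides any virtual content behind the user's real face.
    occlusionGeometry.firstMaterial!.colorBufferWriteMask = []

    let occlusionNode = SCNNode(geometry: occlusionGeometry)
    occlusionNode.renderingOrder = -1  // draw before other scene content
    node.addChildNode(occlusionNode)
}
```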
Because the material renders depth, other objects rendered by SceneKit correctly appear in front of it or behind it. But because the material doesn’t render color, the camera image appears in its place.
The sample app combines this technique with a SceneKit object positioned in front of the user's eyes, creating an effect where the user's nose realistically obscures the object. This object uses physically based materials, so it automatically benefits from the real-time directional lighting information that ARKit provides.
Map Camera Video onto 3D Face Geometry
For additional creative uses of face tracking, you can texture-map the live 2D video feed from the camera onto the 3D geometry that ARKit provides. After mapping pixels in the camera video onto the corresponding points on ARKit’s face mesh, you can modify that mesh, creating illusions such as resizing or distorting the user’s face in 3D.
First, create an ARSCNFaceGeometry for the face and assign the camera image to its main material. ARSCNView automatically sets the scene's background to display the live video feed from the camera, so you can set the geometry's material to use the same contents.
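A sketch of that setup:

```swift
func makeVideoTexturedFaceGeometry(in sceneView: ARSCNView) -> ARSCNFaceGeometry? {
    guard let device = sceneView.device,
          let faceGeometry = ARSCNFaceGeometry(device: device) else { return nil }

    // Reuse the live camera feed that ARSCNView assigns to the scene background.
    let material = faceGeometry.firstMaterial!
    material.diffuse.contents = sceneView.scene.background.contents
    material.lightingModel = .constant  // display the video unlit
    return faceGeometry
}
```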
To correctly align the camera image to the face, you'll also need to modify the texture coordinates that SceneKit uses for rendering the image on the geometry. One easy way to perform this mapping is with a SceneKit shader modifier (see the SCNShadable protocol). The shader code here applies the coordinate system transformations needed to convert each vertex position in the mesh from 3D scene space to the 2D image space used by the video texture:
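The sketch below shows such a shader modifier assigned from Swift; the displayTransform argument is supplied at runtime, as shown in the next step:

```swift
// Geometry shader modifier (Metal) that projects each vertex into the
// camera image and writes the result as the mesh's texture coordinates.
let shaderSource = """
#pragma arguments
float4x4 displayTransform; // supplied from ARFrame.displayTransform(for:viewportSize:)

#pragma body
// Transform the vertex into camera coordinates, then into clip space.
float4 vertexCamera = scn_node.modelViewTransform * _geometry.position;
float4 vertexClipSpace = scn_frame.projectionTransform * vertexCamera;
vertexClipSpace /= vertexClipSpace.w;

// Map clip-space XY ([-1, 1]) to texture coordinates ([0, 1]), flipping Y
// because image coordinates use an upper-left origin.
float4 vertexImageSpace = float4(vertexClipSpace.xy * 0.5 + 0.5, 0.0, 1.0);
vertexImageSpace.y = 1.0 - vertexImageSpace.y;

// Apply ARKit's display transform for the current device orientation.
_geometry.texcoords[0] = (displayTransform * vertexImageSpace).xy;
"""
faceGeometry.shaderModifiers = [.geometry: shaderSource]
```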
When you assign a shader code string to the geometry entry point (SCNShaderModifierEntryPoint.geometry), SceneKit configures its renderer to automatically run that code on the GPU for each vertex in the mesh. This shader code also needs to know the intended orientation for the camera image, so the sample gets that from the ARFrame method displayTransform(for:viewportSize:) and passes it to the shader's displayTransform argument.
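A sketch of that per-frame update, assuming a portrait-only interface; the display transform maps normalized image coordinates to view coordinates, so the shader needs its inverse:

```swift
func updateDisplayTransform(for faceGeometry: ARSCNFaceGeometry, in sceneView: ARSCNView) {
    guard let frame = sceneView.session.currentFrame else { return }
    let t = frame.displayTransform(for: .portrait, viewportSize: sceneView.bounds.size).inverted()

    // Expand the 2D affine transform into a 4x4 matrix for the shader.
    let matrix = SCNMatrix4(
        m11: Float(t.a),  m12: Float(t.b),  m13: 0, m14: 0,
        m21: Float(t.c),  m22: Float(t.d),  m23: 0, m24: 0,
        m31: 0,           m32: 0,           m33: 1, m34: 0,
        m41: Float(t.tx), m42: Float(t.ty), m43: 0, m44: 1)

    // Shader modifier arguments are set through key-value coding.
    faceGeometry.setValue(matrix, forKey: "displayTransform")
}
```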
Animate a Character with Blend Shapes
In addition to the face mesh shown in the earlier examples, ARKit also provides a more abstract representation of the user’s facial expressions. You can use this representation (called blend shapes) to control animation parameters for your own 2D or 3D assets, creating a character that follows the user’s real facial movements and expressions.
As a basic demonstration of blend shape animation, this sample includes a simple model of a robot character's head, created using SceneKit primitive shapes. (See the robotHead.scn file in the sample project.)
To get the user's current facial expression, read the blendShapes dictionary from the face anchor in the renderer(_:didUpdate:for:) delegate callback. Then examine the key-value pairs in that dictionary to calculate animation parameters for your 3D content and update that content accordingly.
There are more than 50 unique ARFaceAnchor.BlendShapeLocation coefficients, of which your app can use as few or as many as necessary to create the artistic effect you want. In this sample, the BlendShapeCharacter class performs this calculation, mapping the eyeBlinkLeft and eyeBlinkRight parameters to one axis of the scale factor of the robot's eyes, and the jawOpen parameter to offset the position of the robot's jaw. A sketch of that mapping follows.
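In this sketch, eyeLeftNode, eyeRightNode, jawNode, originalJawY, and jawHeight are hypothetical properties of the character model, not names from the sample:

```swift
func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    guard let faceAnchor = anchor as? ARFaceAnchor else { return }
    // Coefficients are NSNumber values in the range 0.0 (neutral) to 1.0 (maximum).
    let blendShapes = faceAnchor.blendShapes

    if let blinkLeft = blendShapes[.eyeBlinkLeft] as? Float,
       let blinkRight = blendShapes[.eyeBlinkRight] as? Float {
        // Flatten each eye along one axis as the eyelid closes.
        eyeLeftNode.scale.z = 1 - blinkLeft
        eyeRightNode.scale.z = 1 - blinkRight
    }
    if let jawOpen = blendShapes[.jawOpen] as? Float {
        // Lower the jaw in proportion to how far the mouth is open.
        jawNode.position.y = originalJawY - jawHeight * jawOpen
    }
}
```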