This sample app uses an object detection model trained with Create ML to recognize the tops of dice and their values when the dice roll onto a flat surface.
After you run the object detection model on camera frames through Vision, the model interprets the result to identify when a roll has ended and what values the dice show.
Configure the Sample Code Project
Before you run the sample code project in Xcode, note the following:
You must run this sample code project on a physical device that uses iOS 13 or later. The project doesn’t work with Simulator.
The model works best on white dice with black pips. It may perform differently on dice that use other colors.
Add Inputs to the Request
In Vision, beginning in iOS 13, you can provide inputs other than images to a model by attaching an MLFeatureProvider object to your model. This is useful in the case of object detection when you want to specify different thresholds than the defaults.
As shown below, a feature provider can provide values for the iouThreshold and confidenceThreshold inputs to your object detection model.
To pass camera frames to your model, you first need to find the image orientation that corresponds to your device’s physical orientation. If the device’s orientation changes, the aspect ratio of the images can also change. Because you need to scale the bounding boxes for the detected objects back to your original image, you need to keep track of its size.
Finally, you invoke the VNImageRequestHandler with the image from the camera and information about the current orientation to make a prediction using your object detector.
Now that the app handles providing input data to your model, it’s time to interpret your model’s output.
Draw Bounding Boxes to Understand Your Model’s Behavior
You can get a better understanding of how well your detector performs by drawing bounding boxes around each object and its text label. The dice detection model detects the tops of dice and labels them according to the number of pips shown on each die’s top side.
When playing a dice game, users want to know the result of a roll. The app determines that the roll has ended by waiting for the dice’s positions and values to stabilize.
You can define the requirements of an ended roll as a comparison between two consecutive camera frames with the following conditions:
The number of detected dice must be the same.
For each detected die:
The bounding box must have not moved.
The identified class must match.
Based on these constraints, you can make a function that tells the app whether a roll has ended based on the current and the previous VNRecognizedObjectObservation objects.
Now for every prediction (meaning every new camera frame) you can check whether the roll has ended.
Display the Dice Values
Once the roll has ended, you can display the information on the screen or trigger some other behavior in the setting of a game.
This sample app shows the list of recognized values on screen, sorted from left-most to right-most VNRecognizedObjectObservation. It sorts the values based on where the dice are on the surface according to each observation’s bounding box coordinates. The app does this by sorting the observations by their bounding box’s centerX property in ascending order.