Creating an Image Classifier Model

Train a machine learning model to classify images.


An image classifier is a machine learning model that’s been trained to recognize images. When you give it an image, it responds with a label for that image.

Diagram showing how an image classifier predicts the label "Giraffe" from an image of a giraffe.

You train an image classifier by showing it lots of examples of images you’ve already labeled. For example, you can train an image classifier to recognize safari animals by showing it a variety of photos of elephants, giraffes, lions, and so on.

Diagram showing how Create ML trains a model using collections of labeled images.

Prepare Your Data

Start by preparing the data that you’ll use to train and evaluate the classifier. Create a training data set from about 80% of the images you have for each label. Create a testing data set from the remaining images. Make sure that any given image appears in only one of these two sets.
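
The 80/20 split described above can be sketched in a few lines of Swift. This is only an illustration, not part of Create ML: the function name splitForTraining and the sample file names are hypothetical, and the sketch assumes you apply it once per label so that every label keeps roughly the same proportions.

```swift
import Foundation

/// Shuffles `items` and splits them into a training portion and a testing portion.
/// `trainingFraction` is the share that goes to training (0.8 gives an 80/20 split).
/// Each item lands in exactly one of the two arrays.
func splitForTraining<T>(_ items: [T], trainingFraction: Double = 0.8) -> (training: [T], testing: [T]) {
    let shuffled = items.shuffled()
    let rawCut = Int((Double(shuffled.count) * trainingFraction).rounded())
    // Clamp so that an out-of-range fraction can't index past the array.
    let cut = min(max(rawCut, 0), shuffled.count)
    return (Array(shuffled[..<cut]), Array(shuffled[cut...]))
}

// Hypothetical file names for a single label.
let cheetahImages = (1...10).map { "cheetah-\($0).jpg" }
let split = splitForTraining(cheetahImages)
print(split.training.count, split.testing.count) // prints "8 2"
```

Because the items are shuffled before the cut, running the split again produces a different assignment of images to the two sets.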

Next, organize your data on disk to be compatible with one of the MLImageClassifier.DataSource types. One way to do that is to create a folder called Training Data, and another called Testing Data. In each folder, create subfolders using your labels as names. Then sort the images into the appropriate subfolders for each data set.

Diagram showing a folder called Training Data with subfolders that are named using the label corresponding to the class of images they contain. For example, all the cheetah images go into a subfolder called "Cheetah".

The exact label strings aren’t important, as long as they make sense to you. For example, you might use the label Cheetah for all the images of cheetahs. You don’t have to name the image files in any particular way or add metadata to them. You only need to put them into the folder with the right label.
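
If you prefer to build the folder layout with a script rather than by hand, a minimal sketch using FileManager might look like the following. The function name organize and the dictionary-based input are assumptions made for illustration; the only requirement from the text above is that each image ends up in a subfolder named after its label.

```swift
import Foundation

/// Copies each image into `root/<label>/`, creating the label folder when needed.
/// `labeledImages` maps a label (for example "Cheetah") to the image files for that label.
func organize(_ labeledImages: [String: [URL]], under root: URL) throws {
    let fileManager = FileManager.default
    for (label, images) in labeledImages {
        let labelFolder = root.appendingPathComponent(label, isDirectory: true)
        try fileManager.createDirectory(at: labelFolder, withIntermediateDirectories: true)
        for image in images {
            // Keep the original file name; Create ML doesn't depend on it.
            let destination = labelFolder.appendingPathComponent(image.lastPathComponent)
            try fileManager.copyItem(at: image, to: destination)
        }
    }
}

// Example call (hypothetical paths):
// try organize(["Cheetah": cheetahURLs, "Elephant": elephantURLs],
//              under: URL(fileURLWithPath: "/path/to/Training Data"))
```

The sketch copies rather than moves files so that the original collection stays untouched; copyItem throws if a file with the same name already exists in the destination folder.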

Use at least 10 images per label for the training set, but more is generally better. Also, balance the number of images for each label. For example, don’t use 10 images for Cheetah and 1000 images for Elephant.
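
One way to check these guidelines before training is to count the files in each label subfolder. The sketch below is an assumption-laden helper, not a Create ML API: imageCounts and imbalanceRatio are hypothetical names, and the sketch treats every visible file in a label folder as an image.

```swift
import Foundation

/// Returns the number of visible files in each label subfolder of `root`.
func imageCounts(in root: URL) throws -> [String: Int] {
    let fileManager = FileManager.default
    var counts: [String: Int] = [:]
    let entries = try fileManager.contentsOfDirectory(
        at: root, includingPropertiesForKeys: [.isDirectoryKey], options: [.skipsHiddenFiles])
    for entry in entries {
        let isDirectory = try entry.resourceValues(forKeys: [.isDirectoryKey]).isDirectory ?? false
        guard isDirectory else { continue }
        let files = try fileManager.contentsOfDirectory(
            at: entry, includingPropertiesForKeys: nil, options: [.skipsHiddenFiles])
        counts[entry.lastPathComponent] = files.count
    }
    return counts
}

/// Ratio of the largest label count to the smallest; 1.0 means perfectly balanced.
/// Returns nil when there are no labels or when a label folder is empty.
func imbalanceRatio(_ counts: [String: Int]) -> Double? {
    guard let smallest = counts.values.min(),
          let largest = counts.values.max(),
          smallest > 0 else { return nil }
    return Double(largest) / Double(smallest)
}
```

For the unbalanced example above (10 Cheetah images and 1000 Elephant images), imbalanceRatio returns 100.0. What ratio counts as acceptable is a judgment call that this sketch leaves to you.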

The images can be in any format whose uniform type identifier conforms to public.image. This includes common formats like JPEG and PNG. The images don’t have to be the same size as each other, nor do they have to be any particular size, although it’s best to use images that are at least 299x299 pixels. If possible, train with images collected in a way that’s similar to how images will be collected for prediction.
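
To check whether a file's type conforms to public.image, you can ask the UniformTypeIdentifiers framework. The sketch below assumes macOS 11 or later, where UTType is available, and it judges a file by its extension only; isSupportedImage is a hypothetical helper name.

```swift
import Foundation
import UniformTypeIdentifiers

/// Returns true when the file's extension maps to a type that conforms to public.image.
/// This inspects only the extension, not the file's contents.
func isSupportedImage(_ url: URL) -> Bool {
    guard let type = UTType(filenameExtension: url.pathExtension) else { return false }
    return type.conforms(to: .image)
}

// Hypothetical file paths.
print(isSupportedImage(URL(fileURLWithPath: "/path/to/cheetah.jpg")))
print(isSupportedImage(URL(fileURLWithPath: "/path/to/notes.txt")))
```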

Provide images with variety. For example, use images that show animals from many different angles and in different lighting conditions. A classifier trained on nearly identical images for a given label tends to have poorer performance than one trained on a more diverse image set.

Show an Image Classifier Builder in a Playground

With your data ready, make a new Xcode playground with a macOS target. Use the playground to create an MLImageClassifierBuilder instance and show it in the live view:

// Import CreateMLUI to train the image classifier in the UI.
// For other Create ML tasks, import CreateML instead.
import CreateMLUI

let builder = MLImageClassifierBuilder()
// Present the builder in the playground's live view.
builder.showInLiveView()

Show the assistant editor in Xcode, and then run the playground. When you do this, the live view presents an image classifier:

Screenshot of an Xcode playground with the image classifier builder in the assistant editor.

Train the Image Classifier

Drag your Training Data folder from Finder onto the indicated location in the live view. When you do this, the training process starts and the image classifier shows its progress:

Screenshot of the image classifier in the process of training, showing a progress bar and the image currently being analyzed.

As part of the training process, the image classifier automatically splits your training data into a training set and a validation set. Both affect training, but in different ways: the model learns from the training set, while the validation set is held back to gauge how well the model generalizes as training proceeds. Because the split is done randomly, you might get a different result each time you train the model.

When training finishes, the live view shows training and validation accuracies. These report how well the trained model classifies images from the corresponding sets. Because the model trained on these images, it typically does a good job classifying them.

Screenshot of the image classifier model accuracy output shown after training finishes.

Evaluate the Classifier’s Performance

Next, evaluate your trained model’s performance by testing it with images it’s never seen before. To do this, use the testing data set that you created before you started training. Drag the Testing Data folder into the live view, just as you did with the training data.

Screenshot showing the image classifier after training ready to receive testing data.

The model processes all of the images, making predictions for each. Because this is labeled data, the model can check its own predictions. It then adds the overall evaluation accuracy as the final metric in the UI.

Screenshot showing the image classifier after processing the testing data, including the evaluation model accuracy.

If the evaluation performance isn’t good enough, you may need to retrain with more data (for example, by introducing image augmentation) or alter some other training configuration. For information about how to do more detailed model evaluation, as well as strategies for improving model performance, see Improving Your Model’s Accuracy.
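
When you train in code instead of in the live view, image augmentation and the iteration count are set through MLImageClassifier.ModelParameters. The following is a sketch under stated assumptions: the folder path is hypothetical, the choice of the crop and flip augmentations and 20 iterations is arbitrary, and the initializer shown is the form that shipped with early Create ML SDKs, so later SDKs may mark it deprecated or offer differently named arguments.

```swift
import CreateML
import Foundation

// Hypothetical location of the training folder; replace with your own path.
let trainingFolder = URL(fileURLWithPath: "/path/to/Training Data")

// Assumption: this initializer matches the SDK you build against.
// Check the MLImageClassifier.ModelParameters reference for your Xcode version.
let parameters = MLImageClassifier.ModelParameters(
    featureExtractor: .scenePrint(revision: 1),
    validationData: nil,                 // let Create ML pick the validation split
    maxIterations: 20,
    augmentationOptions: [.crop, .flip]  // generate cropped and flipped variants
)

do {
    let classifier = try MLImageClassifier(
        trainingData: .labeledDirectories(at: trainingFolder),
        parameters: parameters
    )
    print("Training error:", classifier.trainingMetrics.classificationError)
    print("Validation error:", classifier.validationMetrics.classificationError)
} catch {
    print("Training failed:", error)
}
```

Augmentation multiplies the number of images the trainer processes, so expect training to take longer as you enable more options.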

Save the Core ML Model

When your model performs well enough, save it so that you can use it in your app.

Give the classifier a meaningful name. Change the default ImageClassifier to be AnimalClassifier by changing the name in the UI. You can also add more information about the model, like the author and a short description. Click the disclosure triangle to reveal these metadata fields and fill in the details.

Screenshot showing the image classifier renamed as an animal classifier, and with metadata filled in.

Click Save. The model writes a file in .mlmodel format into the directory specified by the Where field.

Add the Model to Your App

Now add the trained model to an existing Core ML enabled app. You can use this model to replace the one that comes with the Classifying Images with Vision and Core ML sample code project. If you do this, the sample app works exactly the same as before, except that it recognizes and classifies animals according to the labels you’ve defined.

Open the sample code project in Xcode and drag your model file into the navigation pane. Once the model is part of the project, Xcode shows you the model metadata, along with other information, like the model class.

Screenshot showing the animal classifier model integrated into the image classifier sample code project.

To use the new model in code, you change only one line. The MobileNet model that comes with the project is instantiated in exactly one place in the ImageClassificationViewController class:

let model = try VNCoreMLModel(for: MobileNet().model)

Change this one line to use the new model class instead:

let model = try VNCoreMLModel(for: AnimalClassifier().model)

The models are interchangeable because both take an image as input and output a label. After your substitution, the sample app classifies images just as before, except it uses your model and its associated labels.
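
For context, the general shape of a Vision classification request built on that model is sketched below. This is not the sample project's code: the function topLabel is hypothetical, and AnimalClassifier is the class Xcode generates from your model file, so the snippet compiles only inside a project that contains that model.

```swift
import CoreML
import Vision

/// Returns the label with the highest confidence for `image`,
/// or nil when the request produces no classification results.
func topLabel(for image: CGImage) throws -> String? {
    // AnimalClassifier is generated by Xcode from AnimalClassifier.mlmodel.
    let model = try VNCoreMLModel(for: AnimalClassifier().model)
    let request = VNCoreMLRequest(model: model)

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    let observations = request.results as? [VNClassificationObservation]
    return observations?.max { $0.confidence < $1.confidence }?.identifier
}
```

The returned identifier is one of the folder names you used as labels during training, such as Cheetah.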

Automate the Process of Building an Image Classifier

You can use an MLImageClassifierBuilder instance to train a useful image classifier with very little code or machine learning expertise, as described in the sections above. However, if you need to script the process of training a model, use an MLImageClassifier instance instead. The steps are essentially the same: prepare data, train a model, evaluate performance, and save the result to a Core ML model file. The difference is that you do everything programmatically. For example, instead of dragging test data into a live view to evaluate the model’s performance, you initialize an MLImageClassifier.DataSource instance and provide it to the classifier’s evaluation(on:) method.
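
A minimal sketch of that scripted workflow follows, assuming the Training Data and Testing Data folder layout described earlier. All paths and metadata values are hypothetical placeholders, and the sketch uses default training parameters.

```swift
import CreateML
import Foundation

// Hypothetical locations; replace with your own paths.
let trainingFolder = URL(fileURLWithPath: "/path/to/Training Data")
let testingFolder = URL(fileURLWithPath: "/path/to/Testing Data")
let outputFile = URL(fileURLWithPath: "/path/to/AnimalClassifier.mlmodel")

do {
    // Train: each subfolder name becomes a label.
    let classifier = try MLImageClassifier(
        trainingData: .labeledDirectories(at: trainingFolder))

    // Accuracy is 1 minus the classification error.
    let trainingAccuracy = 1.0 - classifier.trainingMetrics.classificationError
    let validationAccuracy = 1.0 - classifier.validationMetrics.classificationError
    print("Training accuracy:", trainingAccuracy)
    print("Validation accuracy:", validationAccuracy)

    // Evaluate on images the model has never seen.
    let evaluation = classifier.evaluation(on: .labeledDirectories(at: testingFolder))
    print("Evaluation accuracy:", 1.0 - evaluation.classificationError)

    // Save the Core ML model file along with descriptive metadata.
    let metadata = MLModelMetadata(
        author: "Your Name",
        shortDescription: "Classifies photos of safari animals.",
        version: "1.0")
    try classifier.write(to: outputFile, metadata: metadata)
} catch {
    print("Training or saving failed:", error)
}
```

You can run a script like this in a macOS playground or as a command-line tool, which makes it straightforward to retrain whenever your image collection changes.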

See Also

Image Classification

class MLImageClassifierBuilder

An Xcode playground UI that you use to train a model to classify images.

struct MLImageClassifier

A model you train to classify images programmatically.