Explore machine learning on Apple platforms
Get started with an overview of machine learning frameworks on Apple platforms. Whether you're implementing your first ML model, or an ML expert, we'll offer guidance to help you select the right framework for your app's needs.
Chapters
- 0:00 - Introduction
- 2:05 - Apple Intelligence
- 3:55 - ML-powered APIs
- 7:16 - Running models on device
- 14:45 - Research
Hi, I’m Anil Katti from the On Device Machine Learning team at Apple, and today, I am excited to give you an overview of machine learning products and offerings on Apple platforms. We have a lot of ground to cover, so, let’s dive in!
Underlying many innovative features in our OS and apps are advanced machine learning and AI models. Gesture recognition for spatial computing, portrait mode in image capture, ECG and heart rate monitoring for health. All these features are made possible by machine learning and AI, and the models that power these features run entirely on device! Doing so allows these experiences to be highly interactive while keeping user data on device for enhanced privacy.
On-device machine learning is possible due to powerful Apple silicon. Its unified memory combined with the ML accelerators in the CPU, GPU and Neural Engine allows for efficient and low latency inference. In this video, let’s look at how you could make the most of the powerful Apple hardware coupled with efficient software that only Apple can provide to build magical experiences for our users.
Here’s what I am going to cover. First, I will talk about some of the new system-level features powered by Apple Intelligence that can readily bring intelligent features into your apps. Next, I’ll give an overview of machine learning-powered APIs to help you create unique experiences with models built into our OS. After that, I’ll go over some of the options for bringing other machine learning and AI models on device. And, lastly, we will take a peek into machine learning research at Apple.
As we take a tour of each topic, I’ll point you to videos that are deeper dives into the details if you're interested in learning more.
So let’s start by looking at the intelligence that is already baked into the OS.
This year's release brings exciting advancements with Apple Intelligence powering new features in apps and across the system.
Many of these will be available within your apps such as Writing Tools. Writing Tools helps users communicate even more effectively by rewriting their text for tone and clarity, proofreading for mistakes, and summarizing key points.
The text is processed using Apple Intelligence’s new language capabilities and seamlessly integrates with system text and web views your app is probably already using.
Please check out the “Get started with Writing Tools” video to learn more about what’s possible and the best practices for your app to follow.
Next, Image Playground. With Image Playground, you can effortlessly integrate image creation features into your apps. You do not have to train a model or design safety guardrails.
With just a few lines of code, you can get a pre-built UI for users to create and embed images. Plus, since the model runs locally on device, users can create as many images as they want without worrying about the cost.
This year, we’re introducing significant improvements to Siri to make it sound more natural, contextually relevant, and more personal.
Check out the “Bring your app to Siri” video to learn more about how to enhance your app with these new Siri capabilities using the App Intents framework. We've also revamped the Siri experience for key apps on our platforms, making them more powerful, flexible, and intelligent. Users get to enjoy these amazing features with minimal changes in your app.
If you want to offer your own intelligent features, we have a number of APIs and frameworks that can help you do that without having to deal with the models directly. So let’s take a look.
The Vision framework provides a range of capabilities for visual intelligence, including text extraction, face detection, body pose recognition, and much more.
To streamline integration, we're excited to release a new Swift API with Swift 6 support for Vision this year. In addition, Vision also introduces hand pose detection in body pose requests, as well as aesthetic score requests.
To learn more about how easily you can integrate visual understanding capabilities into your apps, check out the “Discover Swift enhancements in the Vision framework” video.
Beyond Vision, there are additional frameworks that can allow you to segment and understand natural language, convert speech to text, and analyze and identify sounds.
We have some amazing videos on these APIs from the past WWDC, and I encourage you to review them if you're thinking about use cases in these domains.
This year, we are also introducing a new framework for language translation which you can integrate directly into your apps. Now, your app can perform direct language-to-language translation with the simple translation presentation UI that can be launched programmatically.
We're also providing an API that allows you to translate any text and display the output in any UI you’d like.
Using this API, you could also batch up requests and translate more text efficiently.
So please check out the “Meet the Translation API” video to learn more, including how language assets are downloaded and managed on device.
Apple’s ML-powered APIs offer tons of capabilities that your app can readily take advantage of! When you need some model customization for your particular use case, Create ML is a great tool to begin with.
The Create ML app gives you the ability to customize the models powering our frameworks with your own data.
You start by choosing a template aligned with the task you wish to customize. It is then just a few clicks to train, evaluate and iterate on your model with your data.
In addition to the Create ML app, the underlying Create ML and Create ML Components frameworks offer you the capability to train models from within your application on all platforms.
New this year, the Create ML app comes with an object tracking template, which lets you train reference objects to anchor spatial experiences on visionOS. It's now even easier to explore and inspect data annotations prior to training.
And the new time series classification and forecasting components are available in the framework for integration within your app.
Check out the “What’s new in Create ML” video to learn more about each of these topics.
Next, let’s talk about running your models on device. This is for slightly more advanced use cases like, for example, you might want to use a diffusion model in your app that you’ve fine-tuned and optimized or run a large language model that you've downloaded from an open source community like Hugging Face.
You can run a wide array of models across our devices including Whisper, Stable Diffusion, and Mistral. It just takes a couple steps to get the model ready to run in your app.
Let’s take a closer look at the developer workflow.
There are three distinct phases for deploying models on Apple devices. During the first phase, you're focused on defining the model architecture and training the model by providing the right training data.
Next, you convert the model into Core ML format for deployment. In this phase, you're also optimizing the model representation and parameters to achieve great performance while maintaining good accuracy. Lastly, you write code to integrate with Apple frameworks to load and execute the prepared model.
Let’s look at each of these phases in more detail, starting with training.
You can take full advantage of Apple silicon and the unified memory architecture on Mac to architect and train high-performance models with training libraries such as PyTorch, TensorFlow, JAX and MLX. These all use Metal for efficient training on Apple GPUs. Check out the “Train your machine learning and AI models on Apple GPUs” video to learn more about training models on macOS. In this video, we talk about improved training efficiency for scaled dot product attention on Metal, how to integrate custom Metal operations in PyTorch, and newly added mixed-precision support in JAX.
Next, let’s go over the prepare phase. In this phase, you convert your trained model to the Core ML format in just a few steps using Core ML Tools. You can start with any PyTorch model and use Core ML Tools to convert it into the Core ML format. At this point, you can also optimize the model for Apple hardware using a number of compression techniques in the Core ML Tools model optimization toolkit. With the latest enhancements in Core ML Tools this year, we've introduced new model compression techniques, the ability to represent state in models, transformer-specific operations, and a way to have a model hold more than one function.
Please check out the “Bring your machine learning and AI models to Apple silicon” video to learn more about these features and understand the tradeoffs between storage size, latency and accuracy for model deployment.
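To make the tradeoff between storage size and accuracy a little more concrete, here is a minimal sketch of one common compression idea, symmetric 8-bit linear quantization, written in plain Python. It is not the Core ML Tools API; the function names, the sample weights, and the byte counts (4 bytes per float32 weight, 1 byte per int8 weight plus one float32 scale) are assumptions chosen only for illustration.

```python
# Illustrative sketch only: symmetric 8-bit linear quantization of a weight list.
# This is NOT the Core ML Tools API; every name and value here is hypothetical.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus one float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]

weights = [0.82, -1.27, 0.05, 0.40, -0.33]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Largest per-weight error introduced by rounding: about half a step (scale / 2) at most.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage: 4 bytes per float32 weight versus 1 byte per int8 weight plus one float32 scale.
float32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights) + 4

print(f"scale={scale:.5f} max_error={max_error:.5f}")
print(f"float32: {float32_bytes} bytes, int8: {int8_bytes} bytes")
```

The weights take less space, and the price is a small rounding error per weight. Real toolkits offer finer-grained choices than this single-scale scheme, so treat it as a picture of the tradeoff rather than a recipe.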
Once you have the model converted and optimized, the next step is model integration.
It’s here that you write code to interface with OS frameworks to load the model and run inference.
Core ML is the gateway for deploying models on Apple devices and is used by thousands of apps to enable amazing experiences for our users! It provides the performance that is critical for great user experience while simplifying the development workflow with Xcode integration. Core ML segments models across the CPU, GPU and Neural Engine automatically in order to maximize hardware utilization.
The “Deploy machine learning and AI models on-device with Core ML” video covers new Core ML features to help you run state-of-the-art generative AI models on device.
You’ll be introduced to the new MLTensor type designed to simplify the computational glue code stitching models together.
Learn how to manage key-value caches for efficient decoding of large language models with states and explore the use of functions in order to choose a specific style adapter in an image-generation model at runtime.
Lastly, performance reports have been updated to give you more insights into the cost of each operation of your model.
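If the idea of a key-value cache for decoding is new to you, the following minimal sketch shows the general technique in plain Python. It is not the Core ML state API: the `KVCache` class, the `attend` function, and the toy vectors are all hypothetical, and a real model would compute the query, key, and value vectors with learned projections.

```python
import math

# Illustrative sketch only: a toy single-head attention step with a key-value cache.
# This is NOT the Core ML state API; all names and values here are hypothetical.

def attend(query, keys, values):
    """Scaled dot-product attention of one query over the given keys and values."""
    dim = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys]
    # Numerically stable softmax over the scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Weighted sum of the value vectors.
    return [sum(p * v[i] for p, v in zip(probs, values)) for i in range(len(values[0]))]

class KVCache:
    """Holds the keys and values of tokens that were already processed."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Append only the new token's key and value, then attend over everything cached.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

# Toy per-token (query, key, value) vectors standing in for a model's projections.
tokens = [
    ([1.0, 0.0], [1.0, 0.0], [1.0, 2.0]),
    ([0.0, 1.0], [0.0, 1.0], [3.0, 4.0]),
    ([1.0, 1.0], [1.0, 1.0], [5.0, 6.0]),
]

cache = KVCache()
cached_outputs = [cache.step(q, k, v) for q, k, v in tokens]

# Reference: rebuild the key and value lists for the full prefix at every step.
full_outputs = [
    attend(
        tokens[t][0],
        [k for _, k, _ in tokens[: t + 1]],
        [v for _, _, v in tokens[: t + 1]],
    )
    for t in range(len(tokens))
]
```

Both paths produce the same outputs, but the cached path only adds one key and one value per decoding step instead of rebuilding them for the whole prefix. That saved work is the reason to keep keys and values as state between calls.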
While Core ML is the go-to framework for deploying models on device, there may be scenarios where you need finer-grained control over machine learning task execution. For instance, if your app has demanding graphics workloads, Metal’s MPS Graph enables you to sequence ML tasks with other workloads, optimizing GPU utilization. Alternatively, when running real-time signal processing on the CPU, Accelerate's BNNS Graph API provides strict latency and memory management controls for your ML tasks.
These frameworks form part of Core ML’s foundation and are also directly accessible to you. Let’s look at each of these options in more detail starting with MPS Graph.
MPS Graph is built on top of Metal Performance Shaders, and it enables you to load your Core ML model or programmatically build, compile, and execute computational graphs using Metal.
Check out the “Accelerate machine learning with Metal” video to take a deep dive into efficient execution of transformers on the GPU, including new features in MPS Graph to improve compute and memory bandwidth.
Learn how the new MPS Graph strided ND array API helps speed up Fourier transforms, and see how the new MPS Graph viewer makes it easy to understand and gain insights into your model’s execution.
Next, BNNS Graph. BNNS Graph is a new API from the Accelerate framework to optimally run machine learning models on CPU. BNNS Graph has significantly improved performance over the older BNNS kernel-based API. It works with Core ML models and enables real-time and latency-sensitive inference on CPU along with strict control over memory allocations.
It is great for audio processing and similar use cases.
Check out the “Support real-time ML inference on the CPU” video to learn more about BNNS Graph this year.
Our frameworks and APIs provide everything you need to run inference on your machine learning and AI models locally, getting the full benefits of Apple silicon’s hardware acceleration. Along with our domain APIs, your app has access to a range of cutting-edge machine learning tools and APIs on Apple platforms. Depending on your needs and user experience, you can start with the simple out-of-the-box APIs powered by Apple models or go beyond that to use Apple frameworks to deploy machine learning and AI models directly.
We are dedicated to providing you the best ML and AI-powered foundation for building intelligent experiences in your apps. What I've covered so far is built into the OS and developer SDK.
Let's now turn our attention to our last topic: research. Apple continues to push on the cutting edge in machine learning and AI. We’ve published hundreds of papers with novel approaches to AI models and on-device optimization. To facilitate further exploration, we have made sample code, data sets and research tools like MLX available via open source.
MLX is designed by Apple machine learning researchers for other researchers. It has a familiar and extensible API for researchers to explore new ideas on Apple silicon. It is built on top of a unified memory model which allows for efficient operations across CPU and GPU, and explorations in MLX can be done with Python, C++ or Swift. Check out the MLX GitHub page to learn more and contribute to the open source community.
A more recent addition to the open source community is CoreNet, a powerful neural network toolkit designed for researchers and engineers. This versatile framework enables you to train a wide range of models, from standard to novel architectures, and scale them up or down to tackle diverse tasks. We also released OpenELM as part of CoreNet. OpenELM is an efficient language model family with an open training and inference framework. The model has already been converted to MLX and Core ML formats by the open source community to run on Apple devices.
So that was an overview of machine learning on Apple platforms, but we’ve only scratched the surface.
To summarize what we covered, leverage the built-in intelligence of our OS by utilizing standard UI elements to create seamless user experiences. Take it to the next level by customizing your app with ML-powered APIs and Create ML. Train or fine-tune models on powerful Mac GPUs using familiar frameworks like PyTorch powered by Metal. Prepare those models for deployment by optimizing them for Apple silicon using Core ML Tools. Integrate those models to ship stunning experiences in your apps using Core ML, MPS Graph, and BNNS Graph APIs. And lastly, check out Apple's cutting-edge research initiatives, featuring open source frameworks and models. Now, let’s get to building amazing experiences for our users!