Apple Developer Forums

Object Detection / Content Detection with YOLOv3 on VisionOS

Hi, i just wanna ask, Is it possible to run YOLOv3 on visionOS using the main camera to detect objects and show bounding boxes with labels in real-time? I’m wondering if camera access and custom models work for this, or if there’s a better way. Any tips?

8

0

252

Apr ’25

Vision Framework VNTrackObjectRequest: Minimum Valid Bounding Box Size Causing Internal Error (Code=9)

I'm developing a tennis ball tracking feature using Vision Framework in Swift, specifically utilizing VNDetectedObjectObservation and VNTrackObjectRequest. Occasionally (but not always), I receive the following runtime error: Failed to perform SequenceRequest: Error Domain=com.apple.Vision Code=9 "Internal error: unexpected tracked object bounding box size" UserInfo={NSLocalizedDescription=Internal error: unexpected tracked object bounding box size} From my investigation, I suspect the issue arises when the bounding box from the initial observation (VNDetectedObjectObservation) is too small. However, Apple's documentation doesn't clearly define the minimum bounding box size that's considered valid by VNTrackObjectRequest. Could someone clarify: What is the minimum acceptable bounding box width and height (normalized) that Vision Framework's VNTrackObjectRequest expects? Is there any recommended practice or official guidance for bounding box size validation before creating a tracking request? This information would be extremely helpful to reliably avoid this internal error. Thank you!

Media Technologies Photos & Camera ML Compute Machine Learning Camera AVFoundation

1

0

72

Apr ’25

What is the proper way to integrate a CoreML app into Xcode

Hi, I have been trying to integrate a CoreML model into Xcode. The model was made using tensorflow layers. I have included both the model info and a link to the app repository. I am mainly just really confused on why its not working. It seems to only be printing the result for case 1 (there are 4 cases labled, case 0, case 1, case 2, and case 3). If someone could help work me through this error that would be great! here is the link to the repository: https://github.com/ShivenKhurana1/Detect-to-Protect-App this file with the model code is called SecondView.swift and here is the model info: Input: conv2d_input-> image (color 224x224) Output: Identity -> MultiArray (Float32 1x4)

Machine Learning & AI Core ML Xcode Machine Learning Core ML

1

143

Apr ’25

Named Entity Recognition Model for Measurements

In an under-development MacOS & iOS app, I need to identify various measurements from OCR'ed text: length, weight, counts per inch, area, percentage. The unit type (e.g. UnitLength) needs to be identified as well as the measurement's unit (e.g. .inches) in order to convert the measurement to the app's internal standard (e.g. centimetres), the value of which is stored the relevant CoreData entity. The use of NLTagger and NLTokenizer is problematic because of the various representations of the measurements: e.g. "50g.", "50 g", "50 grams", "1 3/4 oz." Currently, I use a bespoke algorithm based on String contains and step-wise evaluation of characters, which is reasonably accurate but requires frequent updating as further representations are detected. I'm aware of the Python SpaCy model being capable of NER Measurement recognition, but am reluctant to incorporate a Python-based solution into a production app. (ref [https://developer.apple.com/forums/thread/30092]) My preference is for an open-source NER Measurement model that can be used as, or converted to, some form of a Swift compatible Machine Learning model. Does anyone know of such a model?

Machine Learning & AI General Swift Natural Language Machine Learning

0

87

Mar ’25

Core ML Model performance far lower on iOS 17 vs iOS 16 (iOS 17 not using Neural Engine)

Hello, I posted an issue on the coremltools GitHub about my Core ML models not performing as well on iOS 17 vs iOS 16 but I'm posting it here just in case. TL;DR The same model on the same device/chip performs far slower (doesn't use the Neural Engine) on iOS 17 compared to iOS 16. Longer description The following screenshots show the performance of the same model (a PyTorch computer vision model) on an iPhone SE 3rd gen and iPhone 13 Pro (both use the A15 Bionic). iOS 16 - iPhone SE 3rd Gen (A15 Bioinc) iOS 16 uses the ANE and results in fast prediction, load and compilation times. iOS 17 - iPhone 13 Pro (A15 Bionic) iOS 17 doesn't seem to use the ANE, thus the prediction, load and compilation times are all slower. Code To Reproduce The following is my code I'm using to export my PyTorch vision model (using coremltools). I've used the same code for the past few months with sensational results on iOS 16. # Convert to Core ML using the Unified Conversion API coreml_model = ct.convert( model=traced_model, inputs=[image_input], outputs=[ct.TensorType(name="output")], classifier_config=ct.ClassifierConfig(class_names), convert_to="neuralnetwork", # compute_precision=ct.precision.FLOAT16, compute_units=ct.ComputeUnit.ALL ) System environment: Xcode version: 15.0 coremltools version: 7.0.0 OS (e.g. MacOS version or Linux type): Linux Ubuntu 20.04 (for exporting), macOS 13.6 (for testing on Xcode) Any other relevant version information (e.g. PyTorch or TensorFlow version): PyTorch 2.0 Additional context This happens across "neuralnetwork" and "mlprogram" type models, neither use the ANE on iOS 17 but both use the ANE on iOS 16 If anyone has a similar experience, I'd love to hear more. Otherwise, if I'm doing something wrong for the exporting of models for iOS 17+, please let me know. Thank you!

Machine Learning & AI General ML Compute Machine Learning Core ML Create ML

1

1.8k

Mar ’25

Xcode AI Coding Assistance Option(s)

Not finding a lot on the Swift Assist technology announced at WWDC 2024. Does anyone know the latest status? Also, currently I use OpenAI's macOS app and its 'Work With...' functionality to assist with Xcode development, and this is okay, certainly saves copying code back and forth, but it seems like AI should be able to do a lot more to help with Xcode app development. I guess I'm looking at what people are doing with AI in Visual Studio, Cline, Cursor and other IDEs and tools like those and feel a bit left out working in Xcode. Please let me know if there are AI tools or techniques out there you use to help with your Xcode projects. Thanks in advance!

Machine Learning & AI General Developer Tools Xcode SwiftUI Machine Learning

6

0

11k

Mar ’25

Creating .mlmodel with Create ML Components

I have rewatched WWDC22 a few times , but still not getting full understanding how to get .mlmodel model file type from components . Example with banana ripeness is cool , but what need to be added to actually have output of .mlmodel , is somewhere full sample code for this type of modular project ? Code is from [https://developer.apple.com/videos/play/wwdc2022/10019) import CoreImage import CreateMLComponents struct ImageRegressor { static let trainingDataURL = URL(fileURLWithPath: "~/Desktop/bananas") static let parametersURL = URL(fileURLWithPath: "~/Desktop/parameters") static func train() async throws -> some Transformer<CIImage, Float> { let estimator = ImageFeaturePrint() .appending(LinearRegressor()) // File name example: banana-5.jpg let data = try AnnotatedFiles(labeledByNamesAt: trainingDataURL, separator: "-", index: 1, type: .image) .mapFeatures(ImageReader.read) .mapAnnotations({ Float($0)! }) let (training, validation) = data.randomSplit(by: 0.8) let transformer = try await estimator.fitted(to: training, validateOn: validation) try estimator.write(transformer, to: parametersURL) return transformer } } I have tried to run it in Mac OS command line type app, Swift-UI but most what I had as output was .pkg with "pipeline.json, parameters, optimizer.json, optimizer"

Machine Learning & AI Create ML Machine Learning Create ML

3

0

473

Mar ’25

CoreML inference on iOS HW uses only CPU on CoreMLTools imported Pytorch model

I have exported a Pytorch model into a CoreML mlpackage file and imported the model file into my iOS project. The model is a Music Source Separation model - running prediction on audio-spectrogram blocks and returning separated audio source spectrograms. Model produces correct results vs. desktop+GPU+Python but the inference on iPhone 15 Pro Max is really, really slow. Using Xcode model Performance tool I can see that the inference isn't automatically managed between compute units - all of it runs on CPU. The Performance tool notation hints all that ops should be supported by both the GPU and Neural Engine. One thing to note, that when initializing the model with MLModelConfiguration option .cpuAndGPU or .cpuAndNeuralEngine there is an error in Xcode console: `Error(s) occurred compiling MIL to BNNS graph: [CreateBnnsGraphProgramFromMIL]: Failed to determine convolution kernel at location at /private/var/containers/Bundle/Application/2E3C4AFF-1FA4-4C95-AAE4-ECEBC0FB0BF9/mymss.app/mymss.mlmodelc/model.mil:2453:12 @ CreateBnnsGraphProgramFromMIL` Before going back hammering the model in Python, are there any tips/strategies I could try in CoreMLTools export phase or in configuring the model for prediction on iOS? My export toolchain is currently Linux with CoreMLTools v8.1, export target iOS16.

Machine Learning & AI Core ML Machine Learning Core ML

2

0

680

Feb ’25

CoreML Conversion Display Issues

Hello! I have a TrackNet model that I have converted to CoreML (.mlpackage) using coremltools, and the conversion process appears to go smoothly as I get the .mlpackage file I am looking for with the weights and model.mlmodel file in the folder. However, when I drag it into Xcode, it just shows up as 4 script tags instead of the model "interface" that is typically expected. I initially was concerned that my model was not compatible with CoreML, but upon logging the conversions, everything seems to be converted properly. I have some code that may be relevant in debugging this issue: How I use the model: model = BallTrackerNet() # this is the model architecture which will be referenced later device = self.device # cpu model.load_state_dict(torch.load("models/balltrackerbest.pt", map_location=device)) # balltrackerbest is the weights model = model.to(device) model.eval() Here is the BallTrackerNet() model itself import torch.nn as nn import torch class ConvBlock(nn.Module): def __init__(self, in_channels, out_channels, kernel_size=3, pad=1, stride=1, bias=True): super().__init__() self.block = nn.Sequential( nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride, padding=pad, bias=bias), nn.ReLU(), nn.BatchNorm2d(out_channels) ) def forward(self, x): return self.block(x) class BallTrackerNet(nn.Module): def __init__(self, out_channels=256): super().__init__() self.out_channels = out_channels self.conv1 = ConvBlock(in_channels=9, out_channels=64) self.conv2 = ConvBlock(in_channels=64, out_channels=64) self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2) self.conv3 = ConvBlock(in_channels=64, out_channels=128) self.conv4 = ConvBlock(in_channels=128, out_channels=128) self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2) self.conv5 = ConvBlock(in_channels=128, out_channels=256) self.conv6 = ConvBlock(in_channels=256, out_channels=256) self.conv7 = ConvBlock(in_channels=256, out_channels=256) self.pool3 = nn.MaxPool2d(kernel_size=2, stride=2) self.conv8 = ConvBlock(in_channels=256, out_channels=512) self.conv9 = ConvBlock(in_channels=512, out_channels=512) self.conv10 = ConvBlock(in_channels=512, out_channels=512) self.ups1 = nn.Upsample(scale_factor=2) self.conv11 = ConvBlock(in_channels=512, out_channels=256) self.conv12 = ConvBlock(in_channels=256, out_channels=256) self.conv13 = ConvBlock(in_channels=256, out_channels=256) self.ups2 = nn.Upsample(scale_factor=2) self.conv14 = ConvBlock(in_channels=256, out_channels=128) self.conv15 = ConvBlock(in_channels=128, out_channels=128) self.ups3 = nn.Upsample(scale_factor=2) self.conv16 = ConvBlock(in_channels=128, out_channels=64) self.conv17 = ConvBlock(in_channels=64, out_channels=64) self.conv18 = ConvBlock(in_channels=64, out_channels=self.out_channels) self.softmax = nn.Softmax(dim=1) self._init_weights() def forward(self, x, testing=False): batch_size = x.size(0) x = self.conv1(x) x = self.conv2(x) x = self.pool1(x) x = self.conv3(x) x = self.conv4(x) x = self.pool2(x) x = self.conv5(x) x = self.conv6(x) x = self.conv7(x) x = self.pool3(x) x = self.conv8(x) x = self.conv9(x) x = self.conv10(x) x = self.ups1(x) x = self.conv11(x) x = self.conv12(x) x = self.conv13(x) x = self.ups2(x) x = self.conv14(x) x = self.conv15(x) x = self.ups3(x) x = self.conv16(x) x = self.conv17(x) x = self.conv18(x) # x = self.softmax(x) out = x.reshape(batch_size, self.out_channels, -1) if testing: out = self.softmax(out) return out def _init_weights(self): for module in self.modules(): if isinstance(module, nn.Conv2d): nn.init.uniform_(module.weight, -0.05, 0.05) if module.bias is not None: nn.init.constant_(module.bias, 0) elif isinstance(module, nn.BatchNorm2d): nn.init.constant_(module.weight, 1) nn.init.constant_(module.bias, 0) I have been struggling with this conversion for almost 2 weeks now so any help, ideas or pointers would be greatly appreciated! Thanks! Michael

Machine Learning & AI Core ML Xcode Machine Learning Core ML

13

0

1k

Jan ’25

Can't apply compression techniques on my CoreML Object Detection model.

import coremltools as ct from coremltools.models.neural_network import quantization_utils # load full precision model model_fp32 = ct.models.MLModel(modelPath) model_fp16 = quantization_utils.quantize_weights(model_fp32, nbits=16) model_fp16.save("reduced-model.mlmodel") I'm testing it with the model from one of Apple's source codes(GameBoardDetector), and it works fine, reduces the model size by half. But there are several problems with my model(trained on CreateML app using Full Network): Quantizing to float 16 does not work(new file gets created with reduced only 0.1mb). Quantizing to below 16 values cause errors, and no file gets created. Here are additional metadata and precisions of models. Working model's additional metadata and precision: Mine's additional metadata and precision:

Machine Learning & AI Core ML Machine Learning Core ML Create ML

2

0

549

Jan ’25

Filtering Contours from Vision

Hello, I need help I desire to select/filter the contours on an image. Not sure best way to do that. Idea select/filter for bottom left most contour? see image attached please. also will need end points or court corners. and need contour to be fine line, smooth, ie accurate of the court end line and side lines only is desired. thank you :) or also glad for other ideas or api to determine the lines/corners I need. glad to email to discuss if that is better/easier actually prefer that. thanks.

Developer Tools & Services Xcode Vision Machine Learning

3

0

444

Jan ’25

CreateML

I'm trying to use the Spatial model to perform Object Tracking on a .usdz file that I create. After loading the file, which I can view correctly in the console, I start the training. Initially, I notice that the disk usage on my PC increases. After several GB, the usage stops, but the training progress remains for hours at 0.00% with the message "About 8hr." How can I understand what the issue is? Has anyone else experienced the same problem? Thanks Diego

Machine Learning & AI Create ML Machine Learning visionOS

1

590

Jan ’25

Create ML Trouble Loading CSV to Train Word Tagger With Commas in Training Data

I'm using Numbers to build a spreadsheet that I'm exporting as a CSV. I then import this file into Create ML to train a word tagger model. Everything has been working fine for all the models I've trained so far, but now I'm coming across a use case that has been breaking the import process: commas within the training data. This is a case that none of Apple's examples show. My project takes Navajo text that has been tokenized by syllables and labels the parts-of-speech. Case that works... Raw text: Naaltsoos yídéeshtah. Tokens column: Naal,tsoos, ,yí,déesh,tah,. Labels column: NObj,NObj,Space,Verb,Verb,VStem,Punct Case that breaks... Raw text: óola, béésh łigaii, tłʼoh naadą́ą́ʼ, wáin, akʼah, dóó á,shįįh Tokens column with tokenized text (commas quoted): óo,la,",", ,béésh, ,łi,gaii,",", ,tłʼoh, ,naa,dą́ą́ʼ,",", ,wáin,",", ,a,kʼah,",", ,dóó, ,á,shįįh (Create ML reports mismatched columns) Tokens column with tokenized text (commas escaped): óo,la,\,, ,béésh, ,łi,gaii,\,, ,tłʼoh, ,naa,dą́ą́ʼ,\,, ,wáin,\,, ,a,kʼah,\,, ,dóó, ,á,shįįh (Create ML reports mismatched columns) Tokens column with tokenized text (commas escape-quoted): óo,la,\",\", ,béésh, ,łi,gaii,\",\", ,tłʼoh, ,naa,dą́ą́ʼ,\",\", ,wáin,\",\", ,a,kʼah,\",\", ,dóó, ,á,shįįh (record not detected by Create ML) Tokens column with tokenized text (commas escape-quoted): óo,la,"","", ,béésh, ,łi,gaii,"","", ,tłʼoh, ,naa,dą́ą́ʼ,"","", ,wáin,"","", ,a,kʼah,"","", ,dóó, ,á,shįįh (Create ML reports mismatched columns) Labels column: NSub,NSub,Punct,Space,NSub,Space,NSub,NSub,Punct,Space,NSub,Space,NSub,NSub,Punct,Space,NSub,Punct,Space,NSub,NSub,Punct,Space,Conj,Space,NSub,NSub Sample From Spreadsheet Solution Needed It's simple enough to escape commas within CSV files, but the format needed by Create ML essentially combines entire CSV records into single columns, so I'm ending up needing a CSV record that contains a mixture of commas to use for parsing and ones to use as character literals. That's where this gets complicated. For this particular use case (which seems like it would frequently arise when training a word tagger model), how should I properly escape a comma literal?

Machine Learning & AI Create ML Natural Language Machine Learning Create ML TabularData

6

0

745

Jan ’25

Problems creating a PipelineRegressor from a PyTorch converted model

I am trying to create a Pipeline with 3 sub-models: a Feature Vectorizer -> a NN regressor converted from PyTorch -> a Feature Extractor (to convert the output tensor to a Double value). The pipeline works fine when I use just a Vectorizer and an Extractor, this is the code: vectorizer = models.feature_vectorizer.create_feature_vectorizer( input_features=["windSpeed", "theoreticalPowerCurve", "windDirection"], # Multiple input features output_feature_name="input" ) preProc_spec = vectorizer[0] ct.utils.convert_double_to_float_multiarray_type(preProc_spec) extractor = models.array_feature_extractor.create_array_feature_extractor( input_features=[("input",datatypes.Array(3,))], # Multiple input features output_name="output", extract_indices = 1 ) ct.utils.convert_double_to_float_multiarray_type(extractor) pipeline_network = pipeline.PipelineRegressor ( input_features = ["windSpeed", "theoreticalPowerCurve", "windDirection"], output_features=["output"] ) pipeline_network.add_model(preProc_spec) pipeline_network.add_model(extractor) ct.utils.convert_double_to_float_multiarray_type(pipeline_network.spec) ct.utils.save_spec(pipeline_network.spec,"Final.mlpackage") This model works ok. I created a regression NN using PyTorch and converted to Core ML either import torch import torch.nn as nn class TurbinePowerModel(nn.Module): def __init__(self): super().__init__() self.linear1 = nn.Linear(3, 4) self.activation1 = nn.ReLU() #self.linear2 = nn.Linear(5, 4) #self.activation2 = nn.ReLU() self.output = nn.Linear(4, 1) def forward(self, x): #x = F.normalize(x, dim = 0) x = self.linear1(x) x = self.activation1(x) # x = self.linear2(x) # x = self.activation2(x) x = self.output(x) return x def forward_inference(self, windSpeed,theoreticalPowerCurve,windDirection): input_tensor = torch.tensor([windSpeed, theoreticalPowerCurve, windDirection], dtype=torch.float32) return self.forward(input_tensor) model = torch.load('TurbinePowerRegression-1layer.pt', weights_only=False) import coremltools as ct print(ct.__version__) import pandas as pd from sklearn.preprocessing import StandardScaler df = pd.read_csv('T1_clean.csv',delimiter=';') X = df[['WindSpeed','TheoreticalPowerCurve','WindDirection']] y = df[['ActivePower']] scaler = StandardScaler() X = scaler.fit_transform(X) y = scaler.fit_transform(y) X_tensor = torch.tensor(X, dtype=torch.float32) y_tensor = torch.tensor(y, dtype=torch.float32) traced_model = torch.jit.trace(model, X_tensor[0]) mlmodel = ct.convert( traced_model, inputs=[ct.TensorType(name="input", shape=X_tensor[0].shape)], classifier_config=None # Optional, for classification tasks ) mlmodel.save("TurbineBase.mlpackage") This model has a Multiarray(Float 32 3) as input and a Multiarray(Float32 1) as output. When I try to include it in the middle of the pipeline (Adjusting the output and input types of the other models accordingly), the process runs ok, but I have the following error when opening the generated model on Xcode: What's is missing on the models. How can I set or adjust this metadata properly? Thanks!!!

Machine Learning & AI Core ML Machine Learning Core ML

1

0

580

Dec ’24

How to Train and Deploy PyTorch Models on Apple Hardware: A Unified Path for Deep ML Practice on Core ML?

Submited as : FB16052050 I am looking to adopt Machine Learning in a more granular manner, going beyond just using pre-built Metal, Core ML, or Create ML approaches. Specifically, I want to train models using Open Python PyTorch libraries, as these offer greater flexibility compared to Apple's native tools. However, these PyTorch APIs are primarily optimised for NVIDIA GPUs (or TPUs), not Apple's M3 or Apple Neural Engine (ANE). My goal is to train the models locally without resorting to cloud-based solutions for training or inference, and to then convert the models into Core ML format for deployment on Apple hardware. This would allow me to leverage Apple's hardware acceleration (via ANE, Metal, and MPS) while maintaining control over the training process in PyTorch. I want to know: What are my options for training models in PyTorch on local hardware (Apple M3 or equivalent), and how can I ensure that the PyTorch model can eventually be converted to Core ML without losing flexibility in model training and customisation? How can I perform training in PyTorch and avoid being restricted to inference-only workflows as Core ML typically allows? Is it possible to use the training capabilities of PyTorch and still get the performance benefits of Apple's hardware for both training and inference? What are the best practices or tools to ensure that my training pipeline in PyTorch is compatible with Apple's hardware constraints and optimised for local execution? I'm seeking a practical, cloud-free approach on Apple Hardware only that allows me to train models in PyTorch (keeping control over the training process) while ensuring that they can be deployed efficiently using Core ML on Apple hardware.

Machine Learning & AI General Metal Machine Learning Core ML

1

0

960

Dec ’24

Feasibility of Real-Time Object Detection in Live Video with Core ML on M1 Pro and A-Series Chips

Hello, I am exploring real-time object detection, and its replacement/overlay with another shape, on live video streams for an iOS app using Core ML and Vision frameworks. My target is to achieve high-speed, real-time detection without noticeable latency, similar to what’s possible with PageFault handling and Associative Caching in OS, but applied to video processing. Given that this requires consistent, real-time model inference, I’m curious about how well the Neural Engine or GPU can handle such tasks on A-series chips in iPhones versus M-series chips (specifically M1 Pro and possibly M4) in MacBooks. Here are a few specific points I’d like insight on: Hardware Suitability: How feasible is it to perform real-time object detection with Core ML on the Neural Engine (i.e., can it maintain low latency)? Would the M-series chips (e.g., M1 Pro or newer) offer a tangible benefit for this type of task compared to the A-series in mobile devices? Which A- and M- chips would be minimum feasible recommendation for such task. Performance Expectations: For continuous, live video object detection, what would be the expected frame rate or latency using an optimized Core ML model? Has anyone benchmarked such applications, and is the M-series required to achieve smooth, real-time processing? Differences Across Apple Hardware: How does performance scale between the A-series Neural Engine and M-series GPU and Neural Engine? Is the M-series vastly superior for real-time Core ML tasks like object detection on live video feeds? If anyone has attempted live object detection on these chips, any insights on real-time performance, limitations, or optimizations would be highly appreciated. Please refer: Apple APIs Thank you in advance for your help!

Machine Learning & AI Core ML Machine Learning Core ML Performance Concurrency

3

0

763

Dec ’24

How to Fine-Tune the SNSoundClassifier for Custom Sound Classification in iOS?

Hi Apple Developer Community, I’m exploring ways to fine-tune the SNSoundClassifier to allow users of my iOS app to personalize the model by adding custom sounds or adjusting predictions. While Apple’s WWDC session on sound classification explains how to train from scratch, I’m specifically interested in using SNSoundClassifier as the base model and building/fine-tuning on top of it. Here are a few questions I have: 1. Fine-Tuning on SNSoundClassifier: Is there a way to fine-tune this model programmatically through APIs? The manual approach using macOS, as shown in this documentation is clear, but how can it be done dynamically - within the app for users or in a cloud backend (AWS/iCloud)? Are there APIs or classes that support such on-device/cloud-based fine-tuning or incremental learning? If not directly, can the classifier’s embeddings be used to train a lightweight custom layer? Training is likely computationally intensive and drains too much on battery, doing it on cloud can be right way but need the right apis to get this done. A sample code will do good. 2. Recommended Approach for In-App Model Customization: If SNSoundClassifier doesn’t support fine-tuning, would transfer learning on models like MobileNetV2, YAMNet, OpenL3, or FastViT be more suitable? Given these models (SNSoundClassifier, MobileNetV2, YAMNet, OpenL3, FastViT), which one would be best for accuracy and performance/efficiency on iOS? I aim to maintain real-time performance without sacrificing battery life. Also it is important to see architecture retention and accuracy after conversion to CoreML model. 3. Cost-Effective Backend Setup for Training: Mac EC2 instances on AWS have a 24-hour minimum billing, which can become expensive for limited user requests. Are there better alternatives for deploying and training models on user request when s/he uploads files (training data)? 4. TensorFlow vs PyTorch: Between TensorFlow and PyTorch, which framework would you recommend for iOS Core ML integration? TensorFlow Lite offers mobile-optimized models, but I’m also curious about PyTorch’s performance when converted to Core ML. 5. Metrics: Metrics I have in mind while picking the model are these: Publisher, Accuracy, Fine-Tuning capability, Real-Time/Live use, Suitability of iPhone 16, Architectural retention after coreML conversion, Reasons for unsuitability, Recommended use case. Any insights or recommended approaches would be greatly appreciated. Thanks in advance!

Machine Learning & AI Create ML ML Compute Machine Learning Core ML Create ML

6

1

1.3k

Dec ’24

Source Files from the Session number 424 WWDC2019

In the 2019 WWDC session Training Object Detection Models in Create ML a JSON file named: annotations_832_newdice_copy.json was show alongside with the images folder named: Dice Training Images Two Sets. Are these resources made available for devs ? I am looking to understand whether the 6000 annotations were needed to be done manually ? Meaning, they have annotated around 1000 images making 6 labels on each manually to achieve this source ? Video shows around 1000 images. Can someone please clarify.

Machine Learning & AI Create ML Machine Learning

2

0

631

Dec ’24

VNCoreMLRequest Callback Not Triggered in Modified Video Classification App

Hi everyone, I'm working on integrating object recognition from live video feeds into my existing app by following Apple's sample code. My original project captures video and records it successfully. However, after integrating the Vision-based object detection components (VNCoreMLRequest), no detections occur, and the callback for the request is never triggered. To debug this issue, I’ve added the following functionality: Set up AVCaptureVideoDataOutput for processing video frames. Created a VNCoreMLRequest using my Core ML model. The video recording functionality works as expected, but no object detection happens. I’d like to know: How to debug this further? Which key debug points or logs could help identify where the issue lies? Have I missed any key configurations? Below is a diff of the modifications I’ve made to my project for the new feature. Diff of Changes: (Attach the diff provided above) Specific Observations: The captureOutput method is invoked correctly, but there is no output or error from the Vision request callback. Print statements in my setup function setForVideoClassify() show that the setup executes without errors. Questions: Could this be due to issues with my Core ML model compatibility or configuration? Is the VNCoreMLRequest setup incorrect, or do I need to ensure specific image formats for processing? Platform: Xcode 16.1, iOS 18.1, Swift 5, SwiftUI, iPhone 11, Darwin MacBook-Pro.local 24.1.0 Darwin Kernel Version 24.1.0: Thu Oct 10 21:02:27 PDT 2024; root:xnu-11215.41.3~2/RELEASE_X86_64 x86_64 Any guidance or advice is appreciated! Thanks in advance.

Machine Learning & AI Core ML Machine Learning Vision Concurrency

1

0

581

Nov ’24

CreateML/CoreML Issues with Large Dataset

Hello All, I'm developing a machine learning model for image classification, which requires managing an exceptionally large dataset comprising over 18,000 classes. I've encountered several hurdles while using Create ML, and I would appreciate any insights or advice from those who have faced similar challenges. Current Issues: Create ML Failures with Large Datasets: When using Create ML, the process often fails with errors such as "Failed to create CVPixelBufferPool." This issue appears when handling particularly large volumes of data. Custom Implementation Struggles: To bypass some of the limitations of Create ML, I've developed a custom solution leveraging the MLImageClassifier within the CreateML framework in my own SwiftUI MacOS app. Initially I had similar errors as I did in Create ML, but I discovered I could move beyond the "extracting features" stage without crashing by employing a workaround: using a timer to cancel and restart the job every 30 seconds. This method is the only way I've been able to finish the extraction phase, even with large datasets, but it causes many errors in the console if I allow it to run too long. Lack of Progress Reporting: Using MLJob<MLImageClassifier>, I've noticed that progress reporting stalls after the feature extraction phase. Although system resources indicate activity, there is no programmatic feedback on what is occurring. Things I've Tried: Data Validation: Ensured that all images in the dataset are valid and non-corrupted, which helps prevent unnecessary issues from faulty data. Custom Implementation with CreateML Framework: Developed a custom solution using the MLImageClassifier within the CreateML framework to gain more control over the training process. Timer-Based Workaround: Employed a workaround using a timer to cancel and restart the job every 30 seconds to move past the "extracting features" phase, allowing progress even with larger datasets. Monitoring System Resources: Observed ongoing system resource usage when process feedback stalled, confirming background processing activity despite the lack of progress reporting. Subset Testing: Successfully created and tested a model on a subset of the data, which validated the approach worked for smaller datasets and could produce a functioning model. Router Model Concept: Considered training multiple models for different subsets of data and implementing a "router" model to decide which specialized model to utilize based on input characteristics. What I Need Help With: Handling Large Datasets: I'm seeking strategies or best practices for effectively utilizing Create ML with large datasets. Any guidance on memory management or alternative methodologies would be immensely helpful. Improving Progress Reporting: I'm looking for ways to obtain more consistent and programmatic progress updates during the training and testing phases. I'm working on a Mac M1 Pro w/ 32GB RAM, with Apple Silicon and am fully integrated within the Apple ecosystem. I am very grateful for any advice or experiences you could share to help overcome these challenges. Thank you! I've pasted the relevant code below: func go() { if self.trainingSession == nil { self.trainingSession = createTrainingSession() } if self.startTime == nil { self.startTime = Date() } job = try! MLImageClassifier.resume(self.trainingSession) job.phase .receive(on: RunLoop.main) .sink { phase in self.phase = phase } .store(in: &cancellables) job.checkpoints .receive(on: RunLoop.main) .sink { checkpoint in self.state = "\(checkpoint)\n\(self.job.progress)" self.progress = self.job.progress.fractionCompleted + 0.2 self.updateTimeEstimates() } .store(in: &cancellables) job.result .receive(on: DispatchQueue.main) .sink(receiveCompletion: { completion in switch completion { case .failure(let error): print("Training Failed: \(error.localizedDescription)") case .finished: print("🎉🎉🎉🎉 TRAINING SESSION FINISHED!!!!") self.trainingFinished = true } }, receiveValue: { classifier in Task { await self.saveModel(classifier) } }) .store(in: &cancellables) } private func createTrainingSession() -> MLTrainingSession<MLImageClassifier> { do { print("Initializing training Data...") let trainingData: MLImageClassifier.DataSource = .labeledDirectories(at: trainingDataURL) let modelParameters = MLImageClassifier.ModelParameters( validation: .split(strategy: .automatic), augmentation: self.augmentations, algorithm: .transferLearning( featureExtractor: .scenePrint(revision: 2), classifier: .logisticRegressor ) ) let sessionParameters = MLTrainingSessionParameters( sessionDirectory: self.sessionDirectoryURL, reportInterval: 1, checkpointInterval: 100, iterations: self.numberOfIterations ) print("Initializing training session...") let trainingSession: MLTrainingSession<MLImageClassifier> if FileManager.default.fileExists(atPath: self.sessionDirectoryURL.path) && isSessionCreated(atPath: self.sessionDirectoryURL.path()) { do { trainingSession = try MLImageClassifier.restoreTrainingSession(sessionParameters: sessionParameters) } catch { print("error resuming, exiting.... \(error.localizedDescription)") fatalError() } } else { trainingSession = try MLImageClassifier.makeTrainingSession( trainingData: trainingData, parameters: modelParameters, sessionParameters: sessionParameters ) } return trainingSession } catch { print("Failed to initialize training session: \(error.localizedDescription)") fatalError() } }

Machine Learning & AI Core ML Swift Machine Learning Core ML Create ML

1

750

Nov ’24

Machine Learning

Post

Replies

Boosts

Views

Activity