Post not yet marked as solved
Most examples of using Core ML with iOS, including those in the documentation, involve creating the model with Xcode on a Mac, then including the Xcode-generated model class (which conforms to MLFeatureProvider) in the iOS app and (re)compiling the app. However, it's also possible to download an uncompiled model directly into an iOS app and then compile it on device (as a background task) - but then there's no generated MLFeatureProvider class. The same applies when using Create ML in an iOS app (iOS 15 beta): there's no automatically generated MLFeatureProvider. So how do you get one? I've seen a few queries here and elsewhere related to this problem, but couldn't find any clear examples of a solution. So, after some experimentation, here's my take on how to go about it:
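(For context, compiling a downloaded model on device is short - this is a minimal sketch, assuming the raw .mlmodel file has already been downloaded to a placeholder modelURL:)

```swift
import CoreML

// Hypothetical location of a downloaded, uncompiled .mlmodel file.
let modelURL = FileManager.default.temporaryDirectory.appendingPathComponent("Runs.mlmodel")

// Compile on device - ideally off the main queue, as compilation can take a while -
// then load the compiled model. The compiled output should be moved to a permanent
// location if you want to reuse it across launches.
let compiledURL = try MLModel.compileModel(at: modelURL)
let mlModel = try MLModel(contentsOf: compiledURL)
```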
Firstly, if you don't know what features the model uses, print the model description, e.g. print("Model: ", mlModel!.modelDescription), which gives:
Model:
  inputs: (
    "course : String",
    "lapDistance : Double",
    "cumTime : Double",
    "distance : Double",
    "lapNumber : Double",
    "cumDistance : Double",
    "lapTime : Double"
  )
  outputs: (
    "duration : Double"
  )
  predictedFeatureName: duration
  ............
A prediction is created by guard let durationOutput = try? mlModel!.prediction(from: runFeatures) ……
where runFeatures is an instance of a class that provides the set of feature names and the value of each feature to be used in making the prediction. So, for my model, which predicts run duration from course, lap number, lap time etc., the RunFeatures class is:
class RunFeatures: MLFeatureProvider {
    var featureNames: Set<String> = ["course", "distance", "lapNumber", "lapDistance", "cumDistance", "lapTime", "cumTime", "duration"]
    var course: String = "n/a"
    var distance: Double = -0.0
    var lapNumber: Double = -0.0
    var lapDistance: Double = -0.0
    var cumDistance: Double = -0.0
    var lapTime: Double = -0.0
    var cumTime: Double = -0.0

    func featureValue(for featureName: String) -> MLFeatureValue? {
        switch featureName {
        case "distance":
            return MLFeatureValue(double: distance)
        case "lapNumber":
            return MLFeatureValue(double: lapNumber)
        case "lapDistance":
            return MLFeatureValue(double: lapDistance)
        case "cumDistance":
            return MLFeatureValue(double: cumDistance)
        case "lapTime":
            return MLFeatureValue(double: lapTime)
        case "cumTime":
            return MLFeatureValue(double: cumTime)
        case "course":
            return MLFeatureValue(string: course)
        default:
            return MLFeatureValue(double: -0.0)
        }
    }
}
Then in my DataModel, prior to prediction, I create an instance of RunFeatures with the input values on which I want to base the prediction:
var runFeatures = RunFeatures()
runFeatures.distance = 3566.0
runFeatures.lapNumber = 1.0
runFeatures.lapDistance = 1001.0
runFeatures.lapTime = 468.0
runFeatures.cumTime = 468.0
runFeatures.cumDistance = 1001.0
runFeatures.course = "Wishing Well Loop"
NOTE: there's no need to provide the output feature ("duration") here, nor in the featureValue method above, but it is required in featureNames.
Then get the prediction with guard let durationOutput = try? mlModel!.prediction(from: runFeatures)
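The result of prediction(from:) is itself an MLFeatureProvider, so the predicted value can be read back by feature name - a minimal sketch:

```swift
// durationOutput conforms to MLFeatureProvider, so read the output by name.
guard let durationOutput = try? mlModel!.prediction(from: runFeatures),
      let duration = durationOutput.featureValue(for: "duration")?.doubleValue else {
    fatalError("Prediction failed")
}
print("Predicted duration: \(duration)")
```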
Regards,
Michaela
Hi,
I am trying out the activity classifier and want to show the results in an app on an iPhone SE.
I am following the example at https://apple.github.io/turicreate/docs/userguide/activity_classifier/export_coreml.html
When calling the prediction function:
...
func performModelPrediction () -> String? {
...
EXC_BAD_ACCESS (code=1, address=0x0) was thrown at this line:
let modelPrediction = try! activityClassificationModel.prediction(
^^^^^^^^ here the error was shown
I can track it down to the source of the mlmodel:
func prediction(input: MyActivityClassifier2Input, options: MLPredictionOptions) throws -> MyActivityClassifier2Output {
    let outFeatures = try model.prediction(from: input, options: options)
    return MyActivityClassifier2Output(features: outFeatures)
}
I created the model with Create ML, working with Xcode 13 beta.
What am I doing wrong? Do you have any hints? Is there a better example for the activity classifier?
Don't hesitate to ask for further details.
-Hans
Here is my code:
//
// ViewController.swift
// motionstor4yboard
//
// Created by Hans Regler on 19.07.21.
//
import Foundation
import UIKit
import CoreML
import CoreMotion
class ViewController: UIViewController {
    override func viewDidLoad() {
        debugPrint("info: start viewDidLoad ... ")
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        // Connect data:
        self.startDeviceMotion()
    }

    // Define some ML Model constants for the recurrent network
    struct ModelConstants {
        static let numOfFeatures = 6
        // Must be the same value you used while training
        static let predictionWindowSize = 100
        static let sensorsUpdateInterval = 1.0 / 10.0
        static let stateInLength = 400
    }

    // Initialize the model, layers, and sensor data arrays
    let activityClassificationModel = MyActivityClassifier2()
    var currentIndexInPredictionWindow = 0
    let accelDataX = try! MLMultiArray(shape: [ModelConstants.predictionWindowSize] as [NSNumber], dataType: MLMultiArrayDataType.double)
    let accelDataY = try! MLMultiArray(shape: [ModelConstants.predictionWindowSize] as [NSNumber], dataType: MLMultiArrayDataType.double)
    let accelDataZ = try! MLMultiArray(shape: [ModelConstants.predictionWindowSize] as [NSNumber], dataType: MLMultiArrayDataType.double)
    let gyroDataX = try! MLMultiArray(shape: [ModelConstants.predictionWindowSize] as [NSNumber], dataType: MLMultiArrayDataType.double)
    let gyroDataY = try! MLMultiArray(shape: [ModelConstants.predictionWindowSize] as [NSNumber], dataType: MLMultiArrayDataType.double)
    let gyroDataZ = try! MLMultiArray(shape: [ModelConstants.predictionWindowSize] as [NSNumber], dataType: MLMultiArrayDataType.double)
    var stateOutput = try! MLMultiArray(shape: [ModelConstants.stateInLength as NSNumber], dataType: MLMultiArrayDataType.double)

    // Initialize CoreMotion Manager
    let motionManager = CMMotionManager()

    func startDeviceMotion() {
        // guard motionManager.isDeviceMotionAvailable else {
        guard motionManager.isAccelerometerAvailable, motionManager.isGyroAvailable else {
            debugPrint("Core Motion Data Unavailable!")
            return
        }
        motionManager.accelerometerUpdateInterval = TimeInterval(ModelConstants.sensorsUpdateInterval)
        motionManager.gyroUpdateInterval = TimeInterval(ModelConstants.sensorsUpdateInterval)
        motionManager.startAccelerometerUpdates(to: .main) { accelerometerData, error in
            guard let accelerometerData = accelerometerData else {
                print("Error: accelerometerData = accelerometerData")
                return
            }
            // Add the current data sample to the data array
            self.addAccelSampleToDataArray(accelSample: accelerometerData)
        }
    }

    func addAccelSampleToDataArray(accelSample: CMAccelerometerData) {
        // Add the current accelerometer reading to the data array
        accelDataX[[currentIndexInPredictionWindow] as [NSNumber]] = accelSample.acceleration.x as NSNumber
        accelDataY[[currentIndexInPredictionWindow] as [NSNumber]] = accelSample.acceleration.y as NSNumber
        accelDataZ[[currentIndexInPredictionWindow] as [NSNumber]] = accelSample.acceleration.z as NSNumber
        // Update the index in the prediction window data array
        currentIndexInPredictionWindow += 1
        // If the data array is full, call the prediction method to get a new model prediction.
        // We assume here for simplicity that the Gyro data was added to the data arrays as well.
        if currentIndexInPredictionWindow == ModelConstants.predictionWindowSize {
            if let predictedActivity = performModelPrediction() {
                // Use the predicted activity here
                // ...
                // Start a new prediction window
                currentIndexInPredictionWindow = 0
            }
        }
    }

    func performModelPrediction() -> String? {
        // Perform model prediction
        let modelPrediction = try! activityClassificationModel.prediction(
            acceleration_x: accelDataX,
            acceleration_y: accelDataY,
            acceleration_z: accelDataZ,
            rotation_x: gyroDataX,
            rotation_y: gyroDataY,
            rotation_z: gyroDataZ,
            stateIn: stateOutput)
        // Update the state vector
        stateOutput = modelPrediction.stateOut
        // Return the predicted activity - the activity with the highest probability
        return modelPrediction.label
    }
}
Create a sample project in Create ML and choose Activity Classifier.
When lowering the sample rate, the preview timeline gets shorter instead of getting longer.
Furthermore, it seems the entire preview timeline breaks (can't scroll) if the sample rate is anything other than 50.
Try it out: Train a model with sample rate 50, then try training it with sample rate 10.
Could I please ask what (at least broadly) the deep-learning architecture of Apple's custom pose models available through Vision is (for example, with VNDetectHumanBodyPoseRequest)? Or whether it is based on a publicly known architecture (such as ResNet), only with modifications or a custom Apple dataset?
I was not able to find this information anywhere in the Apple documentation, and it would be highly beneficial to know, as we are using this data in research about which we want to publish a paper.
Thanks beforehand!
I am trying to make an image classifier, but I keep getting the warning 'init()' is deprecated: Use init(configuration:) instead and handle errors appropriately. I was wondering if it matters, because the app builds and the classifier works. Just wondering what the warning means.
Thanks in advance!
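For reference, the change the warning asks for is mechanical: the old parameterless init traps if the model fails to load, while init(configuration:) throws so the failure can be handled. A rough sketch, assuming a generated class named ImageClassifier (the name here is a placeholder):

```swift
import CoreML

// Deprecated: let classifier = ImageClassifier()
// Recommended: init(configuration:) throws, so handle the error instead of crashing.
let config = MLModelConfiguration()
do {
    let classifier = try ImageClassifier(configuration: config)
    print(classifier.model.modelDescription)
} catch {
    print("Failed to load model: \(error)")
}
```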
I am using a custom Core ML model which takes a MultiArray (Float32, 1 × 85 × 60 × 1) as input. The following is the entire app. As you can see, I initialize the model, create some fake data, and then run a prediction on it.
import SwiftUI
import CoreML

struct ContentView: View {
    let model: A_4 = try! A_4(configuration: .init())
    var body: some View {
        VStack {
            Text("Test")
                .onAppear {
                    print(model.model)
                }
                .onTapGesture {
                    let data = [Float](repeating: 0.3, count: 5100)
                    let reshapedData = try! MLMultiArray(data).reshaped(to: [1, 85, 60, 1])
                    let input = A_4Input(conv2d_input: reshapedData)
                    let prediction = try! model.prediction(input: input)
                    print(prediction.Identity)
                }
        }
    }
}
On any simulator device running iOS 14.5, the prediction works as expected, yielding the following array:
On any number tap: [0.0009301336,0.9990699]
On my real device running iOS 15.0, the model produces a similar result:
On any number tap: [0.0009710453,0.9990289]
Lastly, I tested the app on the following devices, iPhone 11 Pro (iOS 14.7.1), iPhone 11 (iOS 14.6), iPhone SE 2 (iOS 14.6):
On first tap: [0, 1]
On Second tap:
Thread 1: EXC_BAD_ACCESS (code=1, address=0x1470b4000)
As you can see, the first prediction is nonsense, and the second prediction crashes the app. I have attempted to debug this for quite some time and cannot determine the cause of this memory error. I have provided the ContentView and the model here: https://drive.google.com/drive/folders/11hw70kCfyeuRUepEf6cxhmuTb8oLW3jm?usp=sharing
Thanks for any insight you can provide.
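(As an aside for anyone reproducing this: if the reshaped helper above isn't available, a 1 × 85 × 60 × 1 input can be built directly with the standard Core ML API - a sketch, with the same fake data:)

```swift
import CoreML

// Build a Float32 multi-array with the model's expected shape directly.
let shape: [NSNumber] = [1, 85, 60, 1]
let array = try MLMultiArray(shape: shape, dataType: .float32)

// Fill with 0.3 everywhere; a fully-packed MLMultiArray can be
// addressed with a flat linear index.
for i in 0..<array.count {
    array[i] = 0.3
}
```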
Hello, I have an object detection model that I integrated into an app. When I put an image on the preview for the Object Detection File, it classifies the image correctly. However, if I put the same image onto the app, it classifies it differently with different values. I am confused as to how this is happening. Here is my code:
import UIKit
import CoreML
import Vision
import ImageIO

class SecondViewController: UIViewController, UINavigationControllerDelegate {
    @IBOutlet weak var photoImageView: UIImageView!

    lazy var detectionRequest: VNCoreMLRequest = {
        do {
            let model = try VNCoreMLModel(for: EarDetection2().model)
            let request = VNCoreMLRequest(model: model, completionHandler: { [weak self] request, error in
                self?.processDetections(for: request, error: error)
            })
            request.imageCropAndScaleOption = .scaleFit
            return request
        } catch {
            fatalError("Failed to load Vision ML model: \(error)")
        }
    }()

    @IBAction func testPhoto(_ sender: UIButton) {
        let vc = UIImagePickerController()
        vc.sourceType = .photoLibrary
        vc.delegate = self
        present(vc, animated: true)
    }

    @IBOutlet weak var results: UILabel!

    func updateDetections(for image: UIImage) {
        let orientation = CGImagePropertyOrientation(rawValue: UInt32(image.imageOrientation.rawValue))
        guard let ciImage = CIImage(image: image) else { fatalError("Unable to create \(CIImage.self) from \(image).") }
        DispatchQueue.global(qos: .userInitiated).async {
            let handler = VNImageRequestHandler(ciImage: ciImage, orientation: orientation!)
            do {
                try handler.perform([self.detectionRequest])
            } catch {
                print("Failed to perform detection.\n\(error.localizedDescription)")
            }
        }
    }

    func processDetections(for request: VNRequest, error: Error?) {
        DispatchQueue.main.async {
            guard let results = request.results else {
                print("Unable to detect anything.\n\(error!.localizedDescription)")
                return
            }
            let detections = results as! [VNRecognizedObjectObservation]
            self.drawDetectionsOnPreview(detections: detections)
        }
    }

    func drawDetectionsOnPreview(detections: [VNRecognizedObjectObservation]) {
        guard let image = self.photoImageView?.image else {
            return
        }
        let imageSize = image.size
        let scale: CGFloat = 0
        UIGraphicsBeginImageContextWithOptions(imageSize, false, scale)
        for detection in detections {
            image.draw(at: CGPoint.zero)
            print(detection.labels.map({ "\($0.identifier) confidence: \($0.confidence)" }).joined(separator: "\n"))
            print("------------")
            results.text = (detection.labels.map({ "\($0.identifier) confidence: \($0.confidence)" }).joined(separator: "\n"))
            // The coordinates are normalized to the dimensions of the processed image, with the origin at the image's lower-left corner.
            let boundingBox = detection.boundingBox
            let rectangle = CGRect(x: boundingBox.minX * image.size.width, y: (1 - boundingBox.minY - boundingBox.height) * image.size.height, width: boundingBox.width * image.size.width, height: boundingBox.height * image.size.height)
            UIColor(red: 0, green: 1, blue: 0, alpha: 0.4).setFill()
            UIRectFillUsingBlendMode(rectangle, CGBlendMode.normal)
        }
        let newImage = UIGraphicsGetImageFromCurrentImageContext()
        UIGraphicsEndImageContext()
        self.photoImageView?.image = newImage
    }
}

extension SecondViewController: UIImagePickerControllerDelegate {
    func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
        picker.dismiss(animated: true)
        guard let image = info[.originalImage] as? UIImage else {
            return
        }
        self.photoImageView?.image = image
        updateDetections(for: image)
    }
}
I attached pictures of the model preview and the app preview (it may be hard to tell but they are the same image). I have also attached pictures of my files and storyboard.
Any help would be great!
Thanks in advance!
Hey;
I am having issues with a specific CoreML model's initialization under iOS 15
Specifically:
iOS 15 build 19A5318f on iPhone 11
App built with Xcode 13.0 (13A5212g) targeting iOS 13.0
The code basically looks like this:
let configuration = MLModelConfiguration()
configuration.computeUnits = .all
let t0 = CACurrentMediaTime()
let model = try MyModelGeneratedClass(configuration: configuration).model
let t1 = CACurrentMediaTime()
print("Loading time: \(t1 - t0)")
When running that code I see a few errors in the console:
Error: Convolution configuration cannot fit in KMEM (Given: 258048b, Max: 65536b)
(the same error is printed nine times)
and then the app hangs, until finally
Loading time: 116.33786087499993
(it always is around 2 minutes)
If I change the configuration so that
configuration.computeUnits = .cpuAndGPU
then everything works fine.
Did I miss something? The Core ML SDK just hanging there for 2 minutes definitely looks like an iOS 15 bug.
Pardon my ignorance, but how does one obtain image classification data to create a model?
I am trying to build an app that uses CoreML. However, I would like the data that was used to build the model to grow and the model to predict taking that growth into account. So, at the end of the day the more the user uses the app the smarter the app gets at predicting what the user will select next.
For example:
If the user is presented with a variety of clothes to choose from and selects pants, the app will present a list of colors to choose from; let's say the user chooses blue - the next time the user chooses pants, blue is ranked higher than it was the previous time. Is this possible to do? And how do I make selection updates?
Thanks in advance for any ideas or suggestions.
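(This is on-device model personalization. A hedged sketch of the general mechanism with MLUpdateTask - this assumes the model was exported as updatable, and the function and parameter names below are made up for illustration:)

```swift
import CoreML

// Hypothetical: modelURL points to the compiled, updatable model on disk, and
// trainingData wraps the user's new choices (e.g. garment -> chosen color)
// as MLFeatureProvider instances.
func personalize(modelURL: URL,
                 trainingData: MLBatchProvider,
                 completion: @escaping (MLModel) -> Void) throws {
    let config = MLModelConfiguration()
    let task = try MLUpdateTask(forModelAt: modelURL,
                                trainingData: trainingData,
                                configuration: config,
                                completionHandler: { context in
        // Overwrite the stored model with the updated one, then hand it back
        // so future predictions reflect the user's history.
        try? context.model.write(to: modelURL)
        completion(context.model)
    })
    task.resume()
}
```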
I was following the sentence-completion example from this Apple Tech Talk so I could learn Core ML and use the M1's capabilities without giving up PyTorch.
The NLP example code written by the Apple engineer starts around minute 15:00.
I typed in exactly the same code, and it worked fine until I hit the cell performing the Core ML conversion. It gave an error that he did not get during his run. I checked the code I typed many times and cannot see any difference between the two. Unfortunately, the example code was not put anywhere online (at least I failed to find it), so I cannot copy-paste to try it out.
What could the error be related to? Is it because of Core ML? How can this be fixed so the Core ML conversion works?
The code I copied by looking at the video is below:
import torch
import numpy as np
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import coremltools as ct

### Model
class FinishMySentence(torch.nn.Module):
    def __init__(self, model=None, eos=198):
        super(FinishMySentence, self).__init__()
        self.eos = torch.tensor([eos])
        self.next_token_predictor = model
        self.default_token = torch.tensor([0])  # denotes beginning of a sentence

    def forward(self, x):
        sentence = x
        token = self.default_token
        while token != self.eos:  # loop/predict until end-of-sentence token is generated
            predictions, _ = self.next_token_predictor(sentence)  # takes a list of tokens to predict the next one
            token = torch.argmax(predictions[-1, :], dim=0, keepdim=True)
            sentence = torch.cat((sentence, token), 0)
        return sentence

### Initialize the Token Predictor
token_predictor = GPT2LMHeadModel.from_pretrained("gpt2", torchscript=True).eval()

### Trace the token predictor
random_tokens = torch.randint(10000, (5,))
traced_token_predictor = torch.jit.trace(token_predictor, random_tokens)

### Script the Outer Loop
model = FinishMySentence(model=traced_token_predictor)
scripted_model = torch.jit.script(model)

### Convert to Core ML
# in inputs, give the range for the sequence dimension to be between [1, 64]
mlmodel = ct.convert(scripted_model,
                     inputs=[ct.TensorType(name="context", shape=(ct.RangeDim(1, 64),), dtype=np.int32)],
                     )
The output and the error of the last line are attached.
Is it possible to use Core ML to implement multi-label text classification?
I would like to generate and run an ML program inside an app.
I'm familiar with coremltools and the MIL format; however, I can't seem to find any resources on how to generate mlmodel/mlpackage files using Swift on the device.
Is there a Swift equivalent of coremltools? Or is there a way to translate a MIL description of an ML program into an instance of MLModel? Or something similar.
I am using a converted custom PyTorch Model on device for use with real time video.
The Model was converted successfully using both CoreMLTools V4.1 and V5.0b3 (both versions exhibit the same issues). When running the model both from a python environment using CoreMLTools, as well as a MacOS app, using the same input image and supplementary data the output is identical, correct and matches the output of the pure PyTorch model.
However, when running it on device, the model's output is incorrect. On an iPhone XR, using the .all or .cpuAndGPU value of computeUnits, the output is simply a white square with no error or warning message. That is, the output, which we normally expect to be in the range [0, 255], has a value of 255 in every location. However, running with .cpuOnly on the iPhone XR produces the correct output.
Furthermore, when simulating a device from a MacOS machine, the output is correct regardless of the computeUnits value.
On an iPhone 12 this situation gets even more confusing. With the setting .cpuAndGPU, we get the pure white incorrect output, using .cpuOnly we get the correct output, but with .all we get a different incorrect output, an image of wildly incorrect colors but a vaguely similar form to the image we expect. In addition with the .all setting we get the following error.
2021-09-01 15:07:16.595048-0500 sensoriumViewer[33717:10399075] [espresso] [Espresso::ANERuntimeEngine::__forward_segment 3] evaluate[RealTime]WithModel returned 0; code=5 err=Error Domain=com.apple.appleneuralengine Code=5 "processRequest:qos:qIndex:modelStringID:options:error:: 0xd: Program Inference overflow" UserInfo={NSLocalizedDescription=processRequest:qos:qIndex:modelStringID:options:error:: 0xd: Program Inference overflow}
2021-09-01 15:07:16.595103-0500 sensoriumViewer[33717:10399075] [espresso] [Espresso::overflow_error] /private/var/containers/Bundle/Application/16433631-57DE-488C-8772-D9560C3D8B48/sensoriumViewer.app/SensoriumMLTest16V1.mlmodelc/model.espresso.net:3
which makes it pretty clear that there is some sort of integer or floating-point overflow. What I believe is happening:
Regardless of model, the GPU truncates on overflow, giving us values of 255 for all pixels.
On the iPhone 12, .all passes the model to the ANE (Apple Neural Engine), which wraps on overflow, giving unpredictable colors but a roughly correct shape.
On the iPhone XR, .all just uses the GPU, because for some reason this model won't go to the XR's ANE.
Lastly, .cpuOnly does not overflow and gives us the correct result.
Why does the XR not use its ANE for this model? Can the ANE and GPU just not handle 32-bit floats? We are quantizing the model to 16-bit using coremltools - why are we still overflowing?
I see the documentation for the new MLProgram format and it seems promising, will that solve this issue?
Is there any documentation surrounding the supported operations and number precision for Pytorch converted models?
Why are there no errors or warnings when passing this through the GPU?
Any help or insight would be greatly appreciated as the documentation I've seen surrounding the ANE is not very comprehensive.
I'm working with a style-transfer model trained with PyTorch in Google Colaboratory and then converted to an ML package. When I bring it into Xcode and try to preview the asset, I see the following error:
There was a problem decoding this Core ML document
missingMetadataField(named: "inputSchema")
I've been able to train and convert models as .mlmodel files; I'm only seeing this issue with .mlpackage files.
I'm using Xcode 13 beta, which as far as I know is the only version of Xcode that can handle ML packages/programs at the moment, and I'm using the coremltools beta to handle the conversion. Prior to the conversion, or if I convert to an ML model instead, it seems to work just fine.
Is this a problem with how the model is being structured or converted? Is this a problem with how I've set up my Xcode environment/Swift project? Is there some way to update the metadata associated with ML packages to make sure the missing input schema is included?
This is both a heads-up to other developers, and a request for workarounds:
I noticed that the CoreML image segmentation model in my app crashes when compiling the app with Xcode 13 and running it on an iOS 14 device.
Compiling the same app and model with Xcode 12 the compiled CoreML model works just fine when running on the same devices.
The crash seems to be caused by CoreML trying to access the MLShapedArray type, which was only introduced in iOS 15 (see stack trace in screenshot).
So if you have an app using CoreML, make sure you test on iOS 14 devices before submitting a new build using Xcode 13.
For Apple folks, I filed a Feedback on this a little while ago (and of course heard nothing back). The ID is 9584636.
I had hoped this issue would get fixed before the Xcode 13 release, but unfortunately it is still there in the GM build.
Any workarounds (aside from 'keep using Xcode 12') would be appreciated.
Is it best practice to convert AI models to Core ML? What if the model is in TFLite (TensorFlow Lite) format? Is it really necessary to convert it to Core ML before using it in the app?
I got the following error when I added the --encrypt flag to the build phase for my Core ML model file:
coremlc: error: generate command model encryption is not supported on the specific deployment target macos.
Any insights would be appreciated. Thanks.
With the release of Xcode 13, a large section of my Vision framework processing code produces errors and cannot compile; all of these APIs have become deprecated.
This is my original code:
do {
    // Perform VNDetectHumanHandPoseRequest
    try handler.perform([handPoseRequest])
    // Continue only when a hand was detected in the frame.
    // Since we set the maximumHandCount property of the request to 1, there will be at most one observation.
    guard let observation = handPoseRequest.results?.first else {
        self.state = "no hand"
        return
    }
    // Get points for thumb and index finger.
    let thumbPoints = try observation.recognizedPoints(forGroupKey: .handLandmarkRegionKeyThumb)
    let indexFingerPoints = try observation.recognizedPoints(forGroupKey: .handLandmarkRegionKeyIndexFinger)
    let middleFingerPoints = try observation.recognizedPoints(forGroupKey: .handLandmarkRegionKeyMiddleFinger)
    let ringFingerPoints = try observation.recognizedPoints(forGroupKey: .handLandmarkRegionKeyRingFinger)
    let littleFingerPoints = try observation.recognizedPoints(forGroupKey: .handLandmarkRegionKeyLittleFinger)
    let wristPoints = try observation.recognizedPoints(forGroupKey: .all)
    // Look for tip points.
    guard let thumbTipPoint = thumbPoints[.handLandmarkKeyThumbTIP],
          let thumbIpPoint = thumbPoints[.handLandmarkKeyThumbIP],
          let thumbMpPoint = thumbPoints[.handLandmarkKeyThumbMP],
          let thumbCMCPoint = thumbPoints[.handLandmarkKeyThumbCMC] else {
        self.state = "no tip"
        return
    }
    guard let indexTipPoint = indexFingerPoints[.handLandmarkKeyIndexTIP],
          let indexDipPoint = indexFingerPoints[.handLandmarkKeyIndexDIP],
          let indexPipPoint = indexFingerPoints[.handLandmarkKeyIndexPIP],
          let indexMcpPoint = indexFingerPoints[.handLandmarkKeyIndexMCP] else {
        self.state = "no index"
        return
    }
    guard let middleTipPoint = middleFingerPoints[.handLandmarkKeyMiddleTIP],
          let middleDipPoint = middleFingerPoints[.handLandmarkKeyMiddleDIP],
          let middlePipPoint = middleFingerPoints[.handLandmarkKeyMiddlePIP],
          let middleMcpPoint = middleFingerPoints[.handLandmarkKeyMiddleMCP] else {
        self.state = "no middle"
        return
    }
    guard let ringTipPoint = ringFingerPoints[.handLandmarkKeyRingTIP],
          let ringDipPoint = ringFingerPoints[.handLandmarkKeyRingDIP],
          let ringPipPoint = ringFingerPoints[.handLandmarkKeyRingPIP],
          let ringMcpPoint = ringFingerPoints[.handLandmarkKeyRingMCP] else {
        self.state = "no ring"
        return
    }
    guard let littleTipPoint = littleFingerPoints[.handLandmarkKeyLittleTIP],
          let littleDipPoint = littleFingerPoints[.handLandmarkKeyLittleDIP],
          let littlePipPoint = littleFingerPoints[.handLandmarkKeyLittlePIP],
          let littleMcpPoint = littleFingerPoints[.handLandmarkKeyLittleMCP] else {
        self.state = "no little"
        return
    }
    guard let wristPoint = wristPoints[.handLandmarkKeyWrist] else {
        self.state = "no wrist"
        return
    }
    ...
}
Now every line from thumbPoints onwards results in an error. I have changed the first part (not sure if it is correct or not, as it still cannot compile) to:
let thumbPoints = try observation.recognizedPoints(forGroupKey: VNHumanHandPoseObservation.JointsGroupName.thumb.rawValue)
let indexFingerPoints = try observation.recognizedPoints(forGroupKey: VNHumanHandPoseObservation.JointsGroupName.indexFinger.rawValue)
let middleFingerPoints = try observation.recognizedPoints(forGroupKey: VNHumanHandPoseObservation.JointsGroupName.middleFinger.rawValue)
let ringFingerPoints = try observation.recognizedPoints(forGroupKey: VNHumanHandPoseObservation.JointsGroupName.ringFinger.rawValue)
let littleFingerPoints = try observation.recognizedPoints(forGroupKey: VNHumanHandPoseObservation.JointsGroupName.littleFinger.rawValue)
let wristPoints = try observation.recognizedPoints(forGroupKey: VNHumanHandPoseObservation.JointsGroupName.littleFinger.rawValue)
I tried many different things but just could not get the retrieving individual points to work. Can anyone help on fixing this?
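(Not a verified answer, but a sketch of how the individual-point lookups map onto the typed iOS 14+ API: recognizedPoints(_:) takes a typed joints-group name and returns a dictionary keyed by typed joint names, so the old string keys are replaced along these lines - only the thumb, index, and wrist are shown:)

```swift
import Vision

func handleObservation(_ observation: VNHumanHandPoseObservation) throws {
    // Typed group accessor replaces recognizedPoints(forGroupKey:).
    let thumbPoints = try observation.recognizedPoints(.thumb)
    let indexFingerPoints = try observation.recognizedPoints(.indexFinger)

    // Typed joint names replace the old handLandmarkKey… string keys.
    guard let thumbTipPoint = thumbPoints[.thumbTip],
          let thumbIpPoint = thumbPoints[.thumbIP],
          let indexTipPoint = indexFingerPoints[.indexTip] else { return }

    // The wrist lives in the .all group, not a finger group.
    let allPoints = try observation.recognizedPoints(.all)
    guard let wristPoint = allPoints[.wrist] else { return }

    print(thumbTipPoint.location, thumbIpPoint.location,
          indexTipPoint.location, wristPoint.location)
}
```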
I trained a UNet model using PyTorch and used coremltools to convert it to the Core ML format.
I found inconsistent results after the conversion: the Core ML model predicts all-black masks, while the results of the PyTorch model look fine. I have confirmed that there is no preprocessing done as part of the model training/trace process.
Can someone help with this, please?
Thank you!!