Can access to SoundAnalysis (the sound classifier built into the next versions of macOS, iOS, and watchOS) be provided to my app while it runs in the background on iPhone or Apple Watch?
I want to monitor local sounds from Apple Watch and iPhone and take remote action when the data is out of bounds (e.g., send an alert to a caregiver if the coughing rate is too high, or if someone has been knocking on the door for more than a minute).
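The alerting rule described above (act when an event such as a cough occurs too often within a time window) can be sketched independently of how the sounds are classified. The following is a minimal Python sketch, not SoundAnalysis code: the `RateMonitor` class, its parameters, and the timestamps are all hypothetical, and it assumes something else delivers a timestamp for each detected event.

```python
from collections import deque


class RateMonitor:
    """Tracks event timestamps and reports when too many fall in a sliding window.

    Hypothetical helper for illustration; not part of any Apple framework.
    """

    def __init__(self, window_seconds: float, max_events: int):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._events = deque()  # timestamps (seconds) of recent events

    def record(self, timestamp: float) -> bool:
        """Record one detected event; return True if an alert should be sent."""
        self._events.append(timestamp)
        # Drop events that have fallen out of the window ending at `timestamp`.
        while self._events and timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.max_events


# Example: alert when more than 3 coughs are detected within 60 seconds.
monitor = RateMonitor(window_seconds=60, max_events=3)
alerts = [monitor.record(t) for t in [0, 10, 20, 30, 200]]
```

In this made-up sequence only the fourth event crosses the threshold; the event at 200 seconds starts a fresh window because the earlier ones have expired.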
General
Explore the power of machine learning within apps. Discuss integrating machine learning features, share best practices, and explore the possibilities for your app.
Did something change on face detection / Vision Framework on iOS 15?
Using VNDetectFaceLandmarksRequest and reading the VNFaceLandmarkRegion2D to detect eyes does not work on iOS 15 as it did before. I am running the exact same code on an iOS 14 device and an iOS 15 device, and the coordinates are different, as seen in the screenshot.
Any ideas?
I am working through the neural network classifier example in the Updatable > Neural Network section of coremltools.readme.io (https://coremltools.readme.io/docs/updatable-neural-network-classifier-on-mnist-dataset).
I am using the same code, but I get an error saying that coremltools.converters.keras.convert does not exist. I know this can be a coremltools version issue; right now I am using coremltools version 6.2. I converted the model to mlmodel using only .convert, and it converted successfully.
However, the make_updatable function then fails with an error saying that the loss layer input must be a softmax output. From the coremltools API reference, I found that this is because the layer is softmaxND when it should be softmax.
The problem is that when I convert the Keras Sequential model to a Core ML model, the layer names and types change, and softmax becomes softmaxND.
Has anyone faced this issue?
If I execute builder.inspect_layers(last=4), I get this output:
[Id: 32], Name: sequential/dense_1/Softmax (Type: softmaxND)
          Updatable: False
          Input blobs: ['sequential/dense_1/MatMul']
          Output blobs: ['Identity']
[Id: 31], Name: sequential/dense_1/MatMul (Type: batchedMatmul)
          Updatable: False
          Input blobs: ['sequential/dense/Relu']
          Output blobs: ['sequential/dense_1/MatMul']
[Id: 30], Name: sequential/dense/Relu (Type: activation)
          Updatable: False
          Input blobs: ['sequential/dense/MatMul']
          Output blobs: ['sequential/dense/Relu']
In the make_updatable function when I execute
builder.set_categorical_cross_entropy_loss(name='lossLayer', input='Identity')
I get this error
ValueError: Categorical Cross Entropy loss layer input (Identity) must be a softmax layer output.
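To make the failure concrete, here is a small Python sketch of the kind of check that would produce this error. It is a simplified stand-in, not coremltools code: plain dicts replace the real model spec, the function name `find_softmax_producer` is hypothetical, and it assumes the validation requires the layer that outputs the loss input to have the exact type `softmax`, as the error message suggests.

```python
def find_softmax_producer(layers, blob_name):
    """Return the layer that outputs `blob_name` and has type 'softmax', else None."""
    for layer in reversed(layers):
        if blob_name in layer["outputs"] and layer["type"] == "softmax":
            return layer
    return None


# Layer list transcribed from the inspect_layers output in the post.
layers = [
    {"name": "sequential/dense/Relu", "type": "activation",
     "outputs": ["sequential/dense/Relu"]},
    {"name": "sequential/dense_1/MatMul", "type": "batchedMatmul",
     "outputs": ["sequential/dense_1/MatMul"]},
    {"name": "sequential/dense_1/Softmax", "type": "softmaxND",
     "outputs": ["Identity"]},
]

# 'Identity' is produced by a softmaxND layer, so a strict 'softmax' check fails.
strict_match = find_softmax_producer(layers, "Identity")
producer_types = [layer["type"] for layer in layers if "Identity" in layer["outputs"]]
```

Under that assumption, `strict_match` is `None` for this model even though a softmax-like layer produces `Identity`, which matches the behavior reported above.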
Hello,
I posted an issue on the coremltools GitHub about my Core ML models not performing as well on iOS 17 vs iOS 16 but I'm posting it here just in case.
TL;DR
The same model on the same device/chip performs far slower (doesn't use the Neural Engine) on iOS 17 compared to iOS 16.
Longer description
The following screenshots show the performance of the same model (a PyTorch computer vision model) on an iPhone SE 3rd gen and iPhone 13 Pro (both use the A15 Bionic).
iOS 16 - iPhone SE 3rd Gen (A15 Bionic)
iOS 16 uses the ANE and results in fast prediction, load and compilation times.
iOS 17 - iPhone 13 Pro (A15 Bionic)
iOS 17 doesn't seem to use the ANE, thus the prediction, load and compilation times are all slower.
Code To Reproduce
The following is the code I'm using to export my PyTorch vision model (using coremltools).
I've used the same code for the past few months with sensational results on iOS 16.
# Convert to Core ML using the Unified Conversion API
coreml_model = ct.convert(
    model=traced_model,
    inputs=[image_input],
    outputs=[ct.TensorType(name="output")],
    classifier_config=ct.ClassifierConfig(class_names),
    convert_to="neuralnetwork",
    # compute_precision=ct.precision.FLOAT16,
    compute_units=ct.ComputeUnit.ALL
)
System environment:
Xcode version: 15.0
coremltools version: 7.0.0
OS (e.g. MacOS version or Linux type): Linux Ubuntu 20.04 (for exporting), macOS 13.6 (for testing on Xcode)
Any other relevant version information (e.g. PyTorch or TensorFlow version): PyTorch 2.0
Additional context
This happens with both "neuralnetwork" and "mlprogram" model types: neither uses the ANE on iOS 17, but both use the ANE on iOS 16.
If anyone has a similar experience, I'd love to hear more.
Otherwise, if I'm doing something wrong for the exporting of models for iOS 17+, please let me know.
Thank you!
I'm trying to create an NLModel within a MessageFilterExtension handler.
The code works fine in the main app, but when I try to use it in the extension, the model fails to initialize. Even the single line below fails with the error shown.
Single line that fails:
SMS_Classifier is the class Xcode generated for my model. This line works fine in the main app.
let mlModel = try SMS_Classifier(configuration: MLModelConfiguration()).model
Error
Unable to locate Asset for contextual word embedding model for local en.
MLModelAsset: load failed with error Error Domain=com.apple.CoreML Code=0 "initialization of text classifier model with model data failed" UserInfo={NSLocalizedDescription=initialization of text classifier model with model data failed}
Any ideas?
The Translation API introduced at Session 10117 is impressive, but limiting it to SwiftUI is restrictive.
This API works great in the demo, but for more complex apps, it lacks flexibility because it is bound to SwiftUI Views.
Please consider making it available in non-SwiftUI environments.
Topic:
Machine Learning & AI
SubTopic:
General
I'm playing with the new Vision API for iOS 18, specifically the new CalculateImageAestheticsScoresRequest API.
When I try to perform the image observation request I get this error:
internalError("Error Domain=NSOSStatusErrorDomain Code=-1 \"Failed to create espresso context.\" UserInfo={NSLocalizedDescription=Failed to create espresso context.}")
The code is pretty straightforward:
if let image = image {
    let request = CalculateImageAestheticsScoresRequest()
    Task {
        do {
            let cgImg = image.cgImage!
            let observations = try await request.perform(on: cgImg)
            let description = observations.description
            let score = observations.overallScore
            print(description)
            print(score)
        } catch {
            print(error)
        }
    }
}
I'm running it on an M2 using the simulator.
Is it a bug? What's wrong?
iOS 18 App Intents while supporting iOS 17
Hello,
I have an existing app that supports iOS 17. I already have three App Intents but would like to add some of the new iOS 18 app intents like ShowInAppSearchResultsIntent.
However, I am having a hard time using #available or @available to limit this ShowInAppSearchResultsIntent to iOS 18 only while still supporting iOS 17.
Obviously, ShowInAppSearchResultsIntent needs to use @AssistantIntent, which is iOS 18 only, so I mark that struct as @available(iOS 18, *). That works as expected. The trouble begins when I need to add this "SearchSnippetIntent" intent to the AppShortcutsProvider. See the code below:
struct SnippetsShortcutsAppShortcutsProvider: AppShortcutsProvider {
    @AppShortcutsBuilder
    static var appShortcuts: [AppShortcut] {
        //iOS 17+
        AppShortcut(intent: SnippetsNewSnippetShortcutsAppIntent(), phrases: [
            "Create a New Snippet in \(.applicationName) Studio",
        ], shortTitle: "New Snippet", systemImageName: "rectangle.fill.on.rectangle.angled.fill")
        AppShortcut(intent: SnippetsNewLanguageShortcutsAppIntent(), phrases: [
            "Create a New Language in \(.applicationName) Studio",
        ], shortTitle: "New Language", systemImageName: "curlybraces")
        AppShortcut(intent: SnippetsNewTagShortcutsAppIntent(), phrases: [
            "Create a New Tag in \(.applicationName) Studio",
        ], shortTitle: "New Tag", systemImageName: "tag.fill")
        //iOS 18 Only
        AppShortcut(intent: SearchSnippetIntent(), phrases: [
            "Search \(.applicationName) Studio",
            "Search \(.applicationName)"
        ], shortTitle: "Search", systemImageName: "magnifyingglass")
    }
    let shortcutTileColor: ShortcutTileColor = .blue
}
The iOS 18-only AppShortcut produces the following error, and none of the suggested fixes seems to work. Maybe I am going about it the wrong way.
'SearchSnippetIntent' is only available in iOS 18 or newer
Add 'if #available' version check
Add @available attribute to enclosing static property
Add @available attribute to enclosing struct
Thanks in advance for your help.
I was working on my project, and when I tried to train a model the kernel crashed. I restarted the kernel and tried again, and it crashed the same way. I then read a thread about the same issue, in which Apple support recommended installing tensorflow-macos and tensorflow-metal and following the guide at:
https://developer.apple.com/metal/tensorflow-plugin/
I did so and tried everything in the guide, but when I ran the test code provided on that page I got the same error. Here are the code and the output.
Code:
import tensorflow as tf

cifar = tf.keras.datasets.cifar100
(x_train, y_train), (x_test, y_test) = cifar.load_data()
model = tf.keras.applications.ResNet50(
    include_top=True,
    weights=None,
    input_shape=(32, 32, 3),
    classes=100,)

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)
model.compile(optimizer="adam", loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=64)
and here's the output:
Epoch 1/5
The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click here for more info.
View Jupyter log for further details.
Here is part of the log file (the full log was not available):
metal_plugin/src/device/metal_device.cc:1154] Metal device set to: Apple M1
2024-10-06 23:30:49.894405: I metal_plugin/src/device/metal_device.cc:296] systemMemory: 8.00 GB
2024-10-06 23:30:49.894420: I metal_plugin/src/device/metal_device.cc:313] maxCacheSize: 2.67 GB
2024-10-06 23:30:49.894444: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:305] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2024-10-06 23:30:49.894460: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:271] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: )
2024-10-06 23:30:56.701461: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:117] Plugin optimizer for device_type GPU is enabled.
[libprotobuf FATAL google/protobuf/message_lite.cc:353] CHECK failed: target + size == res:
libc++abi: terminating due to uncaught exception of type google::protobuf::FatalException: CHECK failed: target + size == res:
Please respond as soon as possible, as I am working on my project now and keep getting this error.
Device: Apple MacBook Air M1.
WWDC 2024 mentioned that the OCR feature from the Vision framework has support for "Korean, Swedish, and Chinese", but the Swedish support does not seem to be available...
Running either
print(try? VNRecognizeTextRequest().supportedRecognitionLanguages())
or
var ocrRequest = RecognizeTextRequest(.revision3)
print(ocrRequest.supportedRecognitionLanguages)
did not list Swedish among the supported languages, although Korean and Chinese are listed.
Tested on early versions of the iOS 18 developer beta and on the latest version of iOS 18.1 (22B5054e).
Hi Ty for playing
Hey everyone,
I've been updating my code to take advantage of the new Vision API for text recognition in macOS 15. I'm noticing some very odd behavior, though: in general, the new Vision API seems to consistently produce worse results than the old API. For reference, here is how I'm setting up my request.
var request = RecognizeTextRequest()
request.recognitionLevel = getOCRMode() // generally accurate
request.usesLanguageCorrection = !disableLanguageCorrection // generally true
request.recognitionLanguages = language.split(separator: ",").map { Locale.Language(identifier: String($0)) } // generally 'en'
let observations = try? await request.perform(on: image) as [RecognizedTextObservation]
Then I process the results and take the top candidate, which, as mentioned above, is typically of worse quality than the same request performed with the old API.
Am I doing something wrong here?
Almost all the functions in Accelerate are for single precision (Float) and double precision (Double) operations. However, I stumbled upon three integer arithmetic functions which operate on Int32 values. Are there any more functions in Accelerate that operate on integer values? If not, then why aren't there more functions that work with integers?
I am using Apple’s Vision framework with DetectHorizonRequest to detect the horizon in an image. Here is my code:
func processHorizonImage(_ ciImage: CIImage) async {
    let request = DetectHorizonRequest()
    do {
        let result = try await request.perform(on: ciImage)
        print(result)
    } catch {
        print(error)
    }
}
After calling the perform method, I get nil as the result. To rule out problems with the request, I have verified the following:
The input CIImage is valid and contains a visible horizon.
No errors are being thrown.
The relevant frameworks are properly imported.
Given that my image contains a clear horizon, why am I still not getting any results? I would appreciate any help or suggestions to resolve this issue.
Thank you for your support!
This is the image
Hi All,
Is it possible to record a video using Object Capture instead of taking a series of pictures?
Is it possible to get the bounding box coordinates of the object we capture?
Hello.
I can't find anything about the SSML that is used in Apple's speech synthesis.
SSML from Google, Amazon and W3C either don't work or work incorrectly.
Where is Apple's documentation for their implementation of SSML?
Hi,
I am currently running an LSTM on TensorFlow. However, when I switched from Keras 2 to Keras 3, the running time increased about 10 times; it seems there is no GPU acceleration.
Here is my configuration and model summary:
batch size = 256
optimiser = adam
activation = tanh
_________________________________________________________________
 Layer (type)                      Output Shape         Param #
=================================================================
 input_1 (InputLayer)              [(None, 7, 16)]      0
 bidirectional (Bidirectional)     (None, 7, 320)       226560
 bidirectional_1 (Bidirectional)   (None, 7, 512)       1181696
 bidirectional_2 (Bidirectional)   (None, 256)          656384
 dense (Dense)                     (None, 1)            257
=================================================================
Total params: 2064897 (7.88 MB)
Trainable params: 2064897 (7.88 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
This is keras 3.6.0 + tensorflow 2.17.0 + tensorflow-metal 1.1.0 training status:
Training------------
Epoch 1/200
28/681 ━━━━━━━━━━━━━━━━━━━━ 8:13 756ms/step - loss: 0.5901 - mape: 338.6876 - mse: 0.8591
This is keras 2.14.0 + tensorflow 2.14.0 + tensorflow-metal 1.1.0 training status:
Training------------
Epoch 1/200
681/681 [==============================] - 37s 49ms/step - loss: 3.6345 - mape: 499038.7500 - mse: 34.4148 - val_loss: 3.5452 - val_mape: 41.7964 - val_mse: 32.0133 - lr: 0.0010
Is that because Keras 3 has no GPU support on macOS?
Apart from that, if I change the LSTM activation from tanh to sigmoid in Keras 2, it does not use the GPU either.
My system is macOS 15.0.1, and the code was running on Python 3.11.
I am not sure why this happens.
Thanks
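As a rough check on the size of the slowdown, the per-step times in the two logs above can be compared directly. This is only arithmetic on the numbers quoted in the post (the variable names are illustrative), and it ignores validation time and any warm-up effects in the first few steps.

```python
# Per-step times read from the two training logs in the post.
keras3_ms_per_step = 756   # keras 3.6.0 + tensorflow 2.17.0
keras2_ms_per_step = 49    # keras 2.14.0 + tensorflow 2.14.0
steps_per_epoch = 681

slowdown = keras3_ms_per_step / keras2_ms_per_step
keras3_epoch_seconds = keras3_ms_per_step * steps_per_epoch / 1000
keras2_epoch_seconds = keras2_ms_per_step * steps_per_epoch / 1000

print(f"slowdown: {slowdown:.1f}x")
print(f"estimated epoch time: {keras3_epoch_seconds:.0f}s vs {keras2_epoch_seconds:.0f}s")
```

On these figures the per-step slowdown is closer to 15 times than 10, which works out to roughly eight and a half minutes per epoch instead of about half a minute.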
I'm using the iOS 18.2 beta on my iPhone 15 Pro Max, but I can't find Apple Intelligence, and the Settings app still shows the old Siri logo.
Is it just me, or is early access to Image Playground not available yet? I have been waiting a little over 24 hours and still have no access. (No rush if something is wrong; the team might be busy rolling out the public release of the first Apple Intelligence features in iOS 18.1.)
What am I not understanding here?
In short, the view loads text from the JSON descriptions and should then filter out certain words and display a list of the most used words. Debugging shows that the code identifies the words but does not filter them out.
private func loadWordCounts() {
    DispatchQueue.global(qos: .background).async {
        let fileManager = FileManager.default
        guard let documentsDirectory = try? fileManager.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: false) else { return }
        let descriptions = loadDescriptions(fileManager: fileManager, documentsDirectory: documentsDirectory)
        var counts = countWords(in: descriptions)
        let tagsToRemove: Set<NLTag> = [
            .verb,
            .pronoun,
            .determiner,
            .particle,
            .preposition,
            .conjunction,
            .interjection,
            .classifier
        ]
        for (word, _) in counts {
            let tagger = NLTagger(tagSchemes: [.lexicalClass])
            tagger.string = word
            let (tag, _) = tagger.tag(at: word.startIndex, unit: .word, scheme: .lexicalClass)
            if let unwrappedTag = tag, tagsToRemove.contains(unwrappedTag) {
                counts[word] = 0
            }
        }
        DispatchQueue.main.async {
            self.wordCounts = counts
        }
    }
}
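One detail in the loop above may be part of the problem: assigning 0 to `counts[word]` keeps the word in the dictionary with a count of zero, so it is still present when the dictionary is displayed. The following Python sketch illustrates the difference between zeroing and removing an entry. It is not a translation of the NLTagger calls: the `tags` mapping is a hypothetical stand-in for the tagger's output, and the sample words and counts are made up.

```python
counts = {"the": 12, "run": 5, "camera": 7, "lens": 3}
# Hypothetical part-of-speech tags standing in for a tagger's output.
tags = {"the": "determiner", "run": "verb", "camera": "noun", "lens": "noun"}
tags_to_remove = {"verb", "pronoun", "determiner", "preposition", "conjunction"}

# Zeroing keeps every key, so filtered words still show up (with a count of 0).
zeroed = {w: (0 if tags[w] in tags_to_remove else c) for w, c in counts.items()}

# Removing drops the filtered words from the result entirely.
removed = {w: c for w, c in counts.items() if tags[w] not in tags_to_remove}

# Most used words first, built only from the entries that survived filtering.
most_used = sorted(removed, key=removed.get, reverse=True)
```

In Swift, the equivalent of the second approach would be removing the key (for example by assigning nil) rather than assigning 0. Whether the tagger returns useful tags for isolated words is a separate question that this sketch does not address.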