Can access to SoundAnalysis (sound classifier built into next version of MacOS, iOS, WatchOS) be provided to my app running in the background on iPhone or Apple Watch?
I want to monitor local sounds from Apple Watch and iPhones and take remote action for out of band data (ie. send alert to caregiver if coughing rate is too high, or if someone is knocking on the door for more than a minute, etc.)
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Post
Replies
Boosts
Views
Activity
Did something change on face detection / Vision Framework on iOS 15?
Using VNDetectFaceLandmarksRequest and reading the VNFaceLandmarkRegion2D to detect eyes is not working on iOS 15 as it did before. I am running the exact same code on an iOS 14 and iOS 15 device and the coordinates are different as seen on the screenshot?
Any Ideas?
in iOS 15, on stopSpeaking of AVSpeechSynthesizer,
didFinish delegate method getting called instead of didCancel which is working fine in iOS 14 and below version.
Hi,
I have a custom object detection CoreML model and I notice something strange when using the model with the Vision framework.
I have tried two different approaches as to how to process an image and do inference on the CoreML model.
The first one is using the CoreML "raw": initialising the model, getting the input image ready and using the model's .prediction() function to get the models output.
The second one is using Vision to wrap the CoreML model in a VNCoreMLModel, creating a VNCoreMLRequest and using the VNImageRequestHandler to actually perform the model inference. The result of the VNCoreMLRequest is of type VNRecognizedObjectObservation.
The issue I now face is in the difference in the output of both methods. The first method gives back the raw output of the CoreML model: confidence and coordinates. The confidence is an array with size equal to the number of classes in my model (3 in my case). The second method gives back the boundingBox, confidence and labels. However here the confidence is only the confidence for the most likely class (so size is equal to 1). But the confidence I get from the second approach is quite different from the confidence I get during the first approach.
I can use either one of the approaches in my application. However, I really want to find out what is going on and understand how this difference occurred.
Thanks!
Hi,
as showed in the course I created the PyTorch model sample and want to export / convert this model o a CoreML iOS Model using the coremltools. Input is a 224x224 image and output is a image classification (3 different classes)
I am using coremltools for this with this code:
import coremltools as ct
modelml = ct.convert(
scripted_model,
inputs=[ct.ImageType(shape=(1,3,224,244))]
)
I have a working iOS App code which performs with another model which was created using Microsoft Azure Vision.
The PyTorch exported model is loaded and a prediction is performed, but I am getting this error:
Foundation.MonoTouchException: Objective-C exception thrown. Name: NSInvalidArgumentException Reason: -[VNCoreMLFeatureValueObservation identifier]: unrecognized selector sent to instance 0x2805dd3b0
When I check the exported model with Xcode and compare it with another model which is working with the sample iOS App code (created and exported from Microsoft Azure) I can see that the input (for image classification using the device camera) seems ok and is equal, but the output is totally different. (see screenshots)
The working model has two outputs:
loss => Dictionary (String => Double)
classLabel => String
My exported model using coremltools just has one export:
MultiArray(Float32) (name var_1620, I think this is the last feature layer output of the EfficentNetB2)
How do I change my model or my coremltools export to get the correct output for the prediction ?
I read the coreml documentation (https://coremltools.readme.io/docs/pytorch-conversion) and tried some GitHub samples.
But I never get the correct output.
How do I export the PyTorch model so that the output is correct and the prediction will work ?
Best
Marco
I am working on the neural network classifier provided on the coremltools.readme.io in the updatable->neural network section(https://coremltools.readme.io/docs/updatable-neural-network-classifier-on-mnist-dataset).
I am using the same code but I get an error saying that the coremltools.converters.keras.convert does not exist. But this I know can be coreml version issue. Right know I am using coremltools version 6.2. I converted this model to mlmodel with .convert only. It got converted successfully.
But I face an error in the make_updatable function saying the loss layer must be softmax output. Even the coremlt package API reference there I found its because the layer name is softmaxND but it should be softmax.
Now the problem is when I convert the model from Keras sequential model to coreml model. the layer name and type change. And the softmax changes to softmaxND.
Does anyone faced this issue?
if I execute this builder.inspect_layers(last=4)
I get this output
[Id: 32], Name: sequential/dense_1/Softmax (Type: softmaxND)
Updatable: False
Input blobs: ['sequential/dense_1/MatMul']
Output blobs: ['Identity']
[Id: 31], Name: sequential/dense_1/MatMul (Type: batchedMatmul)
Updatable: False
Input blobs: ['sequential/dense/Relu']
Output blobs: ['sequential/dense_1/MatMul']
[Id: 30], Name: sequential/dense/Relu (Type: activation)
Updatable: False
Input blobs: ['sequential/dense/MatMul']
Output blobs: ['sequential/dense/Relu']
In the make_updatable function when I execute
builder.set_categorical_cross_entropy_loss(name='lossLayer', input='Identity')
I get this error
ValueError: Categorical Cross Entropy loss layer input (Identity) must be a softmax layer output.
I need a simple text-to-speech avatar in my iOS app. iOS already has Memojis ready to go - but I cannot find anywhere in the dev docs on how to access Memojis to use in as a tool in app development. Am I missing something? Also - can anyone point me to any resources besides the Apple docs for using AVSpeechSynthesis?
Hi everyone, I might need some help with on-device recognition. It seems that the speech recognition task will discard whatever it has transcribed after a new sentence starts (or it believes it becomes a new sentence) during a single audio session, with requiresOnDeviceRecognition is set to true.
This doesn't happen with requiresOnDeviceRecognition set to false.
System environment: macOS 14 with Xcode 15, deploying to iOS 17
Thank you all!
I want to add shortcut and Siri support using the new AppIntents framework. Running my intent using shortcuts or from spotlight works fine, as the touch based UI for the disambiguation is shown. However, when I ask Siri to perform this action, she gets into a loop of asking me the question to set the parameter.
My AppIntent is implemented as following:
struct StartSessionIntent: AppIntent {
static var title: LocalizedStringResource = "start_recording"
@Parameter(title: "activity", requestValueDialog: IntentDialog("which_activity"))
var activity: ActivityEntity
@MainActor
func perform() async throws -> some IntentResult & ProvidesDialog {
let activityToSelect: ActivityEntity = self.activity
guard let selectedActivity = Activity[activityToSelect.name] else {
return .result(dialog: "activity_not_found")
}
...
return .result(dialog: "recording_started \(selectedActivity.name.localized())")
}
}
The ActivityEntity is implemented like this:
struct ActivityEntity: AppEntity {
static var typeDisplayRepresentation = TypeDisplayRepresentation(name: "activity")
typealias DefaultQuery = MobilityActivityQuery
static var defaultQuery: MobilityActivityQuery = MobilityActivityQuery()
var id: String
var name: String
var icon: String
var displayRepresentation: DisplayRepresentation {
DisplayRepresentation(title: "\(self.name.localized())", image: .init(systemName: self.icon))
}
}
struct MobilityActivityQuery: EntityQuery {
func entities(for identifiers: [String]) async throws -> [ActivityEntity] {
Activity.all()?.compactMap({ activity in
identifiers.contains(where: { $0 == activity.name }) ? ActivityEntity(id: activity.name, name: activity.name, icon: activity.icon) : nil
}) ?? []
}
func suggestedEntities() async throws -> [ActivityEntity] {
Activity.all()?.compactMap({ activity in
ActivityEntity(id: activity.name, name: activity.name, icon: activity.icon)
}) ?? []
}
}
Has anyone an idea what might be causing this and how I can fix this behavior? Thanks in advance
Hi.
A17 Pro Neural Engine has 35 TOPS computational power.
But many third-party benchmarks and articles suggest that it has a little more power than A16 Bionic.
Some references are,
Geekbench ML
Core ML performance benchmark, 2023 edition
How do we use the maximum power of A17 Pro Neural Engine?
For example, I guess that logical devices of ANE on A17 Pro may be two, not one, so we may need to instantiate two Core ML models simultaneously for the purpose.
Please let me know any technical hints.
When I add AppEnity to my model, I receive this error that is still repeated for each attribute in the model. The models are already marked for Widget Extension in Target Membership. I have already cleaned and restarted, nothing works. Will anyone know what I'm doing wrong?
Unable to find matching source file for path "@_swiftmacro_21HabitWidgetsExtension0A05ModelfMm.swift"
import SwiftData
import AppIntents
enum FrecuenciaCumplimiento: String, Codable {
case diario
case semanal
case mensual
}
@Model
final class Habit: AppEntity {
@Attribute(.unique) var id: UUID
var nombre: String
var descripcion: String
var icono: String
var color: String
var esHabitoPositivo: Bool
var valorObjetivo: Double
var unidadObjetivo: String
var frecuenciaCumplimiento: FrecuenciaCumplimiento
static var typeDisplayRepresentation: TypeDisplayRepresentation = "Hábito"
static var defaultQuery = HabitQuery()
var displayRepresentation: DisplayRepresentation {
DisplayRepresentation(title: "\(nombre)")
}
static var allHabits: [Habit] = [
Habit(id: UUID(), nombre: "uno", descripcion: "", icono: "circle", color: "#BF0000", esHabitoPositivo: true, valorObjetivo: 1.0, unidadObjetivo: "", frecuenciaCumplimiento: .mensual),
Habit(id: UUID(), nombre: "dos", descripcion: "", icono: "circle", color: "#BF0000", esHabitoPositivo: true, valorObjetivo: 1.0, unidadObjetivo: "", frecuenciaCumplimiento: .mensual)
]
/*
static func loadAllHabits() async throws {
do {
let modelContainer = try ModelContainer(for: Habit.self)
let descriptor = FetchDescriptor<Habit>()
allHabits = try await modelContainer.mainContext.fetch(descriptor)
} catch {
// Manejo de errores si es necesario
print("Error al cargar hábitos: \(error)")
throw error
}
}
*/
init(id: UUID = UUID(), nombre: String, descripcion: String, icono: String, color: String, esHabitoPositivo: Bool, valorObjetivo: Double, unidadObjetivo: String, frecuenciaCumplimiento: FrecuenciaCumplimiento) {
self.id = id
self.nombre = nombre
self.descripcion = descripcion
self.icono = icono
self.color = color
self.esHabitoPositivo = esHabitoPositivo
self.valorObjetivo = valorObjetivo
self.unidadObjetivo = unidadObjetivo
self.frecuenciaCumplimiento = frecuenciaCumplimiento
}
@Relationship(deleteRule: .cascade)
var habitRecords: [HabitRecord] = []
}
struct HabitQuery: EntityQuery {
func entities(for identifiers: [Habit.ID]) async throws -> [Habit] {
//try await Habit.loadAllHabits()
return Habit.allHabits.filter { identifiers.contains($0.id) }
}
func suggestedEntities() async throws -> [Habit] {
//try await Habit.loadAllHabits()
return Habit.allHabits// .filter { $0.isAvailable }
}
func defaultResult() async -> Habit? {
try? await suggestedEntities().first
}
}
Hello,
We all face issues with the latest tensorflow gpu. Incorrect result, errors etc... We all agreed to pay extra for the M1/2/3 so we could work on a professional grade computer but in the end we must use CPU. When will apple actually comment on that and provide updates. I totally understand these issues aren't fixed overnight and take some time, but i've never seen any apple dev answer saying that they understand and they're working on a fix.
I've basically bought a Mac M3 Pro to be able to run on GPU some stuff without having to purchase a server and it's now useless. It's really frustrating.
I've only been using this late 2021 MBP 16 for nearly 2 years, and now the speaker is producing a crackling sound. Upon inquiring about repairs, customer service informed me that it would cost $728 to replace the speaker, which is a third of the price of the laptop itself. It's absolutely absurd that a $2200 laptop's speaker would fail within such a short period without any external damage. The repair cost being a third of the laptop's price is outrageous. I intend to initiate a petition in the US, hoping to connect with others experiencing the same problem. This is indicative of a subpar product, and customers shouldn't bear the burden of Apple's shortcomings. I plan to share my grievances on various social media platforms and if the issue persists, I will escalate it to the media for further exposure.
In investigating a capture session crash, it's unclear what's causing occasional system pressure interruptions, except that it's happening on older iOS devices. Does Low Power Mode have a meaningful impact on whether these interruptions happen?
I'm working with MLSoundClassifier to try to look for 2 different sounds in a live audio stream. I have been debating with the team if it is better to train 2 separate models, one for each different sound, or train 1 model on both sounds? Has anyone had any experience with this. Some of us believe that we have received better results with the separate models and some with 1 single model trained on both sounds. Thank you!
Hi i am trying to set up tensorflow-metal as instructed by https://developer.apple.com/metal/tensorflow-plugin/
when running line (python -m pip install tensorflow-metal) I get the following error:
ERROR: Could not find a version that satisfies the requirement tensorflow-metal (from versions: none)
ERROR: No matching distribution found for tensorflow-metal
According to the troubleshooting section: "Check that the Python version used in the environment is supported (Python 3.8, Python 3.9, Python 3.10)." My current version is Python 3.9.12.
Any insight would be great!
Can you use View with Transferable View in the one WindowGroup to another
ImmersiveSpace with RealityView?
I can drag, but the drop event isn't captured when with RealityView
var body: some View {
let droppable = Droppable( model: model )
RealityView { content in
// Add the initial RealityKit content
content.add(floorEntity)
}
.onDrop( of: ...
// or
.dropDestination( For ... {}
//or
.gesture( DragGesture()
.targetedToAnyEntity()
.onChanged({ value in
none of them triggers the drop
Hi,
I am looking for a routine to perform complex-valued linear algebra on the GPU in python for scientific programming, in particular quantum physics simulations.
At the moment I am looking for a routine for complex-valued matrix multiplication. I found MLX has a routine for float matrix multiplication, but it does not directly work for complex-valued matrices. I figured a work-around by splitting the complex valued matrix into real and imaginary part and working with the pair, but it makes it cumbersome to integrate with the remainder of the code. I was hoping for a library-based implementation similar to cupy.
I also tried out using the tensorflow linear algebra routines, but I couldn't get them to run on the GPU by now. Specifically, a testfile with a tensorflow.keras.applications.ResNet50 routine runs on the GPU, but the routines from tensorflow.linalg and tensorflow.math that I tested (matmul, expm, eigh) were not running on the GPU.
Any advice on how to make linear algebra calculations on mac GPUs work is highly appreciated! For my application the unified memory might be especially beneficial.
Thank you!
Xcode 15.3 AppIntentsSSUTraining warning: missing the definition of locale # variables.1.definitions
Hello!
I've noticed that adding localizations for AppShortcuts triggers the following warnings in Xcode 15.3:
warning: missing the definition of zh-Hans # variables.1.definitions
warning: missing the definition of zh-Hans # variables.2.definitions
This occurs with both legacy strings files and String Catalogs.
Example project: https://github.com/gongzhang/AppShortcutsLocalizationWarningExample
Hello,
I can see many apps that analyzes sound from microphone in real time. Is there another library like Audiokit or all of them are made with Audiokit??
Thanks