Explore the power of machine learning within apps. Discuss integrating machine learning features, share best practices, and explore the possibilities for your app.

Posts under General subtopic

Post

Replies

Boosts

Views

Activity

Named Entity Recognition Model for Measurements
In an under-development MacOS & iOS app, I need to identify various measurements from OCR'ed text: length, weight, counts per inch, area, percentage. The unit type (e.g. UnitLength) needs to be identified as well as the measurement's unit (e.g. .inches) in order to convert the measurement to the app's internal standard (e.g. centimetres), the value of which is stored the relevant CoreData entity. The use of NLTagger and NLTokenizer is problematic because of the various representations of the measurements: e.g. "50g.", "50 g", "50 grams", "1 3/4 oz." Currently, I use a bespoke algorithm based on String contains and step-wise evaluation of characters, which is reasonably accurate but requires frequent updating as further representations are detected. I'm aware of the Python SpaCy model being capable of NER Measurement recognition, but am reluctant to incorporate a Python-based solution into a production app. (ref [https://developer.apple.com/forums/thread/30092]) My preference is for an open-source NER Measurement model that can be used as, or converted to, some form of a Swift compatible Machine Learning model. Does anyone know of such a model?
0
0
89
Mar ’25
DockKit .track() has no effect using VNDetectFaceRectanglesRequest
Hi, I'm testing DockKit with a very simple setup: I use VNDetectFaceRectanglesRequest to detect a face and then call dockAccessory.track(...) using the detected bounding box. The stand is correctly docked (state == .docked) and dockAccessory is valid. I'm calling .track(...) with a single observation and valid CameraInformation (including size, device, orientation, etc.). No errors are thrown. To monitor this, I added a logging utility – track(...) is being called 10–30 times per second, as recommended in the documentation. However: the stand does not move at all. There is no visible reaction to the tracking calls. Is there anything I'm missing or doing wrong? Is VNDetectFaceRectanglesRequest supported for DockKit tracking, or are there hidden requirements? Would really appreciate any help or pointers – thanks! That's my complete code: extension VideoFeedViewController: AVCaptureVideoDataOutputSampleBufferDelegate { func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { guard let frame = CMSampleBufferGetImageBuffer(sampleBuffer) else { return } detectFace(image: frame) func detectFace(image: CVPixelBuffer) { let faceDetectionRequest = VNDetectFaceRectanglesRequest() { vnRequest, error in guard let results = vnRequest.results as? [VNFaceObservation] else { return } guard let observation = results.first else { return } let boundingBoxHeight = observation.boundingBox.size.height * 100 #if canImport(DockKit) if let dockAccessory = self.dockAccessory { Task { try? await trackRider( observation.boundingBox, dockAccessory, frame, sampleBuffer ) } } #endif } let imageResultHandler = VNImageRequestHandler(cvPixelBuffer: image, orientation: .up) try? imageResultHandler.perform([faceDetectionRequest]) func combineBoundingBoxes(_ box1: CGRect, _ box2: CGRect) -> CGRect { let minX = min(box1.minX, box2.minX) let minY = min(box1.minY, box2.minY) let maxX = max(box1.maxX, box2.maxX) let maxY = max(box1.maxY, box2.maxY) let combinedWidth = maxX - minX let combinedHeight = maxY - minY return CGRect(x: minX, y: minY, width: combinedWidth, height: combinedHeight) } #if canImport(DockKit) func trackObservation(_ boundingBox: CGRect, _ dockAccessory: DockAccessory, _ pixelBuffer: CVPixelBuffer, _ cmSampelBuffer: CMSampleBuffer) throws { // Zähle den Aufruf TrackMonitor.shared.trackCalled() let invertedBoundingBox = CGRect( x: boundingBox.origin.x, y: 1.0 - boundingBox.origin.y - boundingBox.height, width: boundingBox.width, height: boundingBox.height ) guard let device = captureDevice else { fatalError("Kamera nicht verfügbar") } let size = CGSize(width: Double(CVPixelBufferGetWidth(pixelBuffer)), height: Double(CVPixelBufferGetHeight(pixelBuffer))) var cameraIntrinsics: matrix_float3x3? = nil if let cameraIntrinsicsUnwrapped = CMGetAttachment( sampleBuffer, key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix, attachmentModeOut: nil ) as? Data { cameraIntrinsics = cameraIntrinsicsUnwrapped.withUnsafeBytes { $0.load(as: matrix_float3x3.self) } } Task { let orientation = getCameraOrientation() let cameraInfo = DockAccessory.CameraInformation( captureDevice: device.deviceType, cameraPosition: device.position, orientation: orientation, cameraIntrinsics: cameraIntrinsics, referenceDimensions: size ) let observation = DockAccessory.Observation( identifier: 0, type: .object, rect: invertedBoundingBox ) let observations = [observation] guard let image = CMSampleBufferGetImageBuffer(sampleBuffer) else { print("no image") return } do { try await dockAccessory.track(observations, cameraInformation: cameraInfo) } catch { print(error) } } } #endif func clearDrawings() { boundingBoxLayer?.removeFromSuperlayer() boundingBoxSizeLayer?.removeFromSuperlayer() } } } } @MainActor private func getCameraOrientation() -> DockAccessory.CameraOrientation { switch UIDevice.current.orientation { case .portrait: return .portrait case .portraitUpsideDown: return .portraitUpsideDown case .landscapeRight: return .landscapeRight case .landscapeLeft: return .landscapeLeft case .faceDown: return .faceDown case .faceUp: return .faceUp default: return .corrected } }
0
0
59
Mar ’25
Keras on Mac (M4) is giving inconsistent results compared to running on NVIDIA GPUs
I have seen inconsistent results for my Colab machine learning notebooks running locally on a Mac M4, compared to running the same notebook code on either T4 (in Colab) or a RTX3090 locally. To illustrate the problems I have set up a notebook that implements two simple CNN models that solves the Fashion-MNIST problem. https://colab.research.google.com/drive/11BhtHhN079-BWqv9QvvcSD9U4mlVSocB?usp=sharing For the good model with 2M parameters I get the following results: T4 (Colab, JAX): Test accuracy: 0.925 3090 (Local PC via ssh tunnel, Jax): Test accuracy: 0.925 Mac M4 (Local, JAX): Test accuracy: 0.893 Mac M4 (Local, Tensorflow): Test accuracy: 0.893 That is, I see a significant drop in performance when I run on the Mac M4 compared to the NVIDIA machines, and it seems to be independent of backend. I however do not know how to pinpoint this to either Keras or Apple’s METAL implementation. I have reported this to Keras: https://colab.research.google.com/drive/11BhtHhN079-BWqv9QvvcSD9U4mlVSocB?usp=sharing but as this can be (likely is?) an Apple Metal issue, I wanted to report this here as well. On the mac I am running the following Python libraries: keras 3.9.1 tensorflow 2.19.0 tensorflow-metal 1.2.0 jax 0.5.3 jax-metal 0.1.1 jaxlib 0.5.3
0
0
81
Mar ’25
Vision Framework VNTrackObjectRequest: Minimum Valid Bounding Box Size Causing Internal Error (Code=9)
I'm developing a tennis ball tracking feature using Vision Framework in Swift, specifically utilizing VNDetectedObjectObservation and VNTrackObjectRequest. Occasionally (but not always), I receive the following runtime error: Failed to perform SequenceRequest: Error Domain=com.apple.Vision Code=9 "Internal error: unexpected tracked object bounding box size" UserInfo={NSLocalizedDescription=Internal error: unexpected tracked object bounding box size} From my investigation, I suspect the issue arises when the bounding box from the initial observation (VNDetectedObjectObservation) is too small. However, Apple's documentation doesn't clearly define the minimum bounding box size that's considered valid by VNTrackObjectRequest. Could someone clarify: What is the minimum acceptable bounding box width and height (normalized) that Vision Framework's VNTrackObjectRequest expects? Is there any recommended practice or official guidance for bounding box size validation before creating a tracking request? This information would be extremely helpful to reliably avoid this internal error. Thank you!
0
0
81
Apr ’25
AI and ML
Hello. I am willing to hire game developer for cards game called baloot. My question is Can the developer implement an AI when the computer is playing and the computer on the same time the conputer improves his rises level without any interaction? 🌹
0
0
61
Jun ’25
Is there anywhere to get precompiled WhisperKit models for Swift?
If try to dynamically load WhipserKit's models, as in below, the download never occurs. No error or anything. And at the same time I can still get to the huggingface.co hosting site without any headaches, so it's not a blocking issue. let config = WhisperKitConfig( model: "openai_whisper-large-v3", modelRepo: "argmaxinc/whisperkit-coreml" ) So I have to default to the tiny model as seen below. I have tried so many ways, using ChatGPT and others, to build the models on my Mac, but too many failures, because I have never dealt with builds like that before. Are there any hosting sites that have the models (small, medium, large) already built where I can download them and just bundle them into my project? Wasted quite a large amount of time trying to get this done. import Foundation import WhisperKit @MainActor class WhisperLoader: ObservableObject { var pipe: WhisperKit? init() { Task { await self.initializeWhisper() } } private func initializeWhisper() async { do { Logging.shared.logLevel = .debug Logging.shared.loggingCallback = { message in print("[WhisperKit] \(message)") } let pipe = try await WhisperKit() // defaults to "tiny" self.pipe = pipe print("initialized. Model state: \(pipe.modelState)") guard let audioURL = Bundle.main.url(forResource: "44pf", withExtension: "wav") else { fatalError("not in bundle") } let result = try await pipe.transcribe(audioPath: audioURL.path) print("result: \(result)") } catch { print("Error: \(error)") } } }
0
0
75
Jun ’25
SpeechTranscriber time indexes - detect pauses?
I'm experimenting with the new SpeechTranscriber in macOS/iOS 26, transcribing speech from a prerecorded mp4 file. Speed and quality are amazing! I've told the transcriber to include time indexes. Each run is always exactly one word, which can be very useful. When I look at the indexes the end of one run is always identical to the start of the next run, even if there's a pause. I'd like to identify pauses, perhaps to generate something like phrases for subtitling. With each run of text going into the next I can't do this, other than using punctuation - which might be rather rough. Any suggestions on detecting pauses, or getting that kind of metadata from the transcriber? Here's a short sample, showing each run with the start, end, and characters in the run: 105.9 --> 107.04 I 107.04 --> 107.16 think 107.16 --> 108.0 more 108.0 --> 108.42 lighting 108.42 --> 108.6 is 108.6 --> 108.72 definitely 108.72 --> 109.2 needed, 109.2 --> 109.92 downtown. 109.98 --> 110.4 My 110.4 --> 110.52 only 110.52 --> 110.7 question 110.7 --> 111.06 is, 111.06 --> 111.48 poll 111.48 --> 111.78 five, 111.78 --> 111.84 that 111.84 --> 112.08 you're 112.08 --> 112.38 increasing 112.38 --> 112.5 the 112.5 --> 113.34 50,000? 113.4 --> 113.58 Where 113.58 --> 113.88 exactly
0
0
135
Jun ’25
AttributedString in App Intents
In this WWDC25 session, it is explictely mentioned that apps should support AttributedString for text parameters to their App Intents. However, I have not gotten this to work. Whenever I pass rich text (either generated by the new "Use Model" intent or generated manually for example using "Make Rich Text from Markdown"), my Intent gets an AttributedString with the correct characters, but with all attributes stripped (so in effect just plain text). struct TestIntent: AppIntent { static var title = LocalizedStringResource(stringLiteral: "Test Intent") static var description = IntentDescription("Tests Attributed Strings in Intent Parameters.") @Parameter var text: AttributedString func perform() async throws -> some IntentResult & ReturnsValue<AttributedString> { return .result(value: text) } } Is there anything else I am missing?
0
0
184
Jul ’25
Problem running NLContextualEmbeddingModel in simulator
Environment MacOC 26 Xcode Version 26.0 beta 7 (17A5305k) simulator: iPhone 16 pro iOS: iOS 26 Problem NLContextualEmbedding.load() fails with the following error In simulator Failed to load embedding from MIL representation: filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"] filesystem error: in create_directories: Permission denied ["/var/db/com.apple.naturallanguaged/com.apple.e5rt.e5bundlecache"] Failed to load embedding model 'mul_Latn' - '5C45D94E-BAB4-4927-94B6-8B5745C46289' assetRequestFailed(Optional(Error Domain=NLNaturalLanguageErrorDomain Code=7 "Embedding model requires compilation" UserInfo={NSLocalizedDescription=Embedding model requires compilation})) in #Playground I'm new to this embedding model. Not sure if it's caused by my code or environment. Code snippet import Foundation import NaturalLanguage import Playgrounds #Playground { // Prefer initializing by script for broader coverage; returns NLContextualEmbedding? guard let embeddingModel = NLContextualEmbedding(script: .latin) else { print("Failed to create NLContextualEmbedding") return } print(embeddingModel.hasAvailableAssets) do { try embeddingModel.load() print("Model loaded") } catch { print("Failed to load model: \(error)") } }
0
0
301
3w