Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.

Posts under Machine Learning & AI topic

Post

Replies

Boosts

Views

Created

The CoreML MultiArray Float16 input is not supported for running on the NPU, and this issue only occurs on the iPhone 11.
Xcode Version: Version 15.2 (15C500b)
com.github.apple.coremltools.source: torch==1.12.1
com.github.apple.coremltools.version: 7.2
Compute: Mixed (Float16, Int32)
Storage: Float16
The input to the mlpackage is MultiArray (Float16 1 × 1 × 544 × 960).
The flexibility is: 1 × 1 × 544 × 960 | 1 × 1 × 384 × 640 | 1 × 1 × 736 × 1280 | 1 × 1 × 1088 × 1920
I tested this on iPhone XR, iPhone 11, iPhone 12, iPhone 13, and iPhone 14. On all devices except the iPhone 11, the model runs correctly on the NPU. However, on the iPhone 11, the model runs on the CPU instead. Here is the CoreMLTools conversion code I used:
    mlmodel = ct.convert(
        trace,
        inputs=[ct.TensorType(shape=input_shape, name="input", dtype=np.float16)],
        outputs=[ct.TensorType(name="output", dtype=np.float16, shape=output_shape)],
        convert_to='mlprogram',
        minimum_deployment_target=ct.target.iOS16,
    )
3
0
875
Sep ’24
[NewbQs] Is this possible with AppIntentDomains?
As a user, when viewing a photo or image, I want to be able to tell Siri, “add this to ”, similar to the example from the WWDC presentation where a photo is added to a note in the Notes app. Is this possible with app intent domains as they are documented? I see domains like open-file and open-photo, but I don't know whether those are appropriate for this kind of functionality.
1
0
567
Sep ’24
iOS 18: Siri not passing string parameters to AppIntents if the string is a question
Xcode Version 16.0 (16A242d) iOS18 - Swift
There seems to be a behavior change on iOS18 when using AppShortcuts and AppIntents to pass string parameters. After Siri prompts for a string property requestValueDialog, if the user makes a statement the string is passed. If the user's statement is a question, however, the string is not sent to the AppIntent and instead Siri attempts to answer that question.
Example Code:
    struct MyAppNameShortcuts: AppShortcutsProvider {
        @AppShortcutsBuilder
        static var appShortcuts: [AppShortcut] {
            AppShortcut(
                intent: AskQuestionIntent(),
                phrases: [
                    "Ask \(.applicationName) a question",
                ]
            )
        }
    }
    struct AskQuestionIntent: AppIntent {
        static var title: LocalizedStringResource = .init(stringLiteral: "Ask a question")
        static var openAppWhenRun: Bool = false
        static var parameterSummary: some ParameterSummary {
            Summary("Search for \(\.$query)")
        }
        @Dependency
        private var apiClient: MockApiClient
        @Parameter(title: "Query", requestValueDialog: .init(stringLiteral: "What would you like to ask?"))
        var query: String
        // perform is not called if user asks a question such as "What color is the moon?" in response to requestValueDialog
        // iOS 17, the same string is passed though
        @MainActor
        func perform() async throws -> some IntentResult & ProvidesDialog & ShowsSnippetView {
            print("Query is: \(query)")
            let queryResult = try await apiClient.askQuery(queryString: query)
            let dialog = IntentDialog(
                full: .init(stringLiteral: queryResult.answer),
                supporting: .init(stringLiteral: "The answer to \(queryResult.question) is...")
            )
            let view = SiriAnswerView(queryResult: queryResult)
            return .result(dialog: dialog, view: view)
        }
    }
Given the above mock code:
iOS17:
    Hey Siri
    Ask (AppName) a question
    Siri responds "What would you like to ask?"
    Say "What color is the moon?"
    String of "What color is the moon?" is passed to the AppIntent
iOS18:
    Hey Siri
    Ask (AppName) a question
    Siri responds "What would you like to ask?"
    Say "What color is the moon?"
    Siri answers the question "What color is the moon?"
    Follow above steps again and instead reply "Moon"
    "Moon" is passed to AppIntent
Basically any interrogative string parameters seem to be intercepted and sent to Siri proper rather than the provided AppIntent in iOS 18
1
0
871
Sep ’24
Install jax on macOS 15.1 Beta (24B5046f)
I followed these instructions to install jax (https://developer.apple.com/metal/jax/), but I still encountered this error: RuntimeError: This version of jaxlib was built using AVX instructions, which your CPU and/or operating system do not support. This error is frequently encountered on macOS when running an x86 Python installation on ARM hardware. In this case, try installing an ARM build of Python. Otherwise, you may be able work around this issue by building jaxlib from source. How can I fix it?
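The error text itself points at an x86 Python running on ARM hardware, so a first step is to check which architecture the interpreter reports. The sketch below uses only the standard library; `describe_python_arch` is a made-up helper name, and the mapping of strings to verdicts is an assumption for illustration rather than an exhaustive list.

```python
import platform


def describe_python_arch(machine: str) -> str:
    """Classify the architecture string reported by platform.machine()."""
    machine = machine.lower()
    if machine in ("arm64", "aarch64"):
        return "native ARM build"
    if machine in ("x86_64", "amd64"):
        # On an Apple silicon Mac this usually means Python runs under Rosetta.
        return "x86 build"
    return "unrecognized architecture: " + machine


if __name__ == "__main__":
    # Architecture of the interpreter that is actually running this script.
    reported = platform.machine()
    print(reported, "->", describe_python_arch(reported))
```

If this reports an x86 build on an Apple silicon Mac, the jaxlib message suggests installing an ARM build of Python before reinstalling jax.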
1
0
1.6k
Sep ’24
Apple AI / Data Protection & Processing
Where does the processing power for certain AI capabilities come from? Is the processing done on the originating device, or does the device send the contents of the originating information to Apple assets, which process it and return the result to the end user? For example, if I ask the AI to summarize an email, will it send the contents of the email to an Apple AI asset to process it and return the summary to the originating device?
0
0
572
Sep ’24
Tensorflow-metal: Problems with Keras 3.0
The following code, taken from keras.io, produces this error:
    InternalError: Exception encountered when calling GPT2Tokenizer.call(). ... 2 root error(s) found. (0) INTERNAL: stream cannot wait for itself
This is on macOS on a MacBook with an M2 Max. Setting the optimizer to "Adam" does not help.
    import keras_nlp  # version 0.15
    causal_lm = keras_nlp.models.GPT2CausalLM.from_preset("gpt2_base_en")
    causal_lm.compile(sampler="greedy")
    # the next call produces the error
    causal_lm.generate(["Keras is a"])
1
0
937
Sep ’24
iOS18 using VNRecognizeTextRequest2 but VNRecognizeTextRequest3 used
VNRecognizeTextRequest2 does not recognize English text that is upside down, whereas VNRecognizeTextRequest3 recognizes it even when it is upside down. Up to iOS 17, my app (minimum deployment target iOS 16) could select VNRecognizeTextRequest2 or VNRecognizeTextRequest3 depending on whether upside-down text detection was required. On iOS 18, however, even if I set VNRecognizeTextRequest2 in my code, the result seems to be based on VNRecognizeTextRequest3, because upside-down text is detected. I know that VNRecognizeTextRequest2 is deprecated on iOS 18. How can I tell whether an observation result is upside down or not? Is there a solution with VNRecognizeTextRequest3?
1
0
566
Sep ’24
CoreML, Invalid indexing on GPU
I believe I am encountering a bug in the MPS backend of CoreML: a slice_by_index + gather operation appears to be converted incorrectly, so the wrong values are indexed during GPU execution. The following Python program, using the coremltools library, illustrates the issue:
    import tempfile
    import numpy as np
    import torch
    import coremltools as ct
    from coremltools.converters.mil import Builder as mb
    from coremltools.converters.mil.mil import types
    dB = 20480
    shapeI = (2, dB)
    shapeB = (dB, 22)
    @mb.program(input_specs=[mb.TensorSpec(shape=shapeI, dtype=types.int32), mb.TensorSpec(shape=shapeB)])
    def prog(i, b):
        lslice = mb.slice_by_index(x=i, begin=[0, 0], end=[1, dB], end_mask=[False, True], squeeze_mask=[True, False], name='slice_left')
        rslice = mb.slice_by_index(x=i, begin=[1, 0], end=[2, dB], end_mask=[False, True], squeeze_mask=[True, False], name='slice_right')
        ldata = mb.gather(x=b, indices=lslice)
        rdata = mb.gather(x=b, indices=rslice)
        # actual bug in optimization of gather+slice
        x = mb.add(x=ldata, y=rdata)
        # dummy ops to make a bigger graph to run on GPU
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=2.)
        x = mb.mul(x=x, y=.5)
        x = mb.mul(x=x, y=1., name='result')
        return x
    input_types = [
        ct.TensorType(name="i", shape=shapeI, dtype=np.int32),
        ct.TensorType(name="b", shape=shapeB, dtype=np.float32),
    ]
    with tempfile.TemporaryDirectory() as tmpdirname:
        model_cpu = ct.convert(prog, inputs=input_types, compute_precision=ct.precision.FLOAT32, compute_units=ct.ComputeUnit.CPU_ONLY, package_dir=tmpdirname + 'model_cpu.mlpackage')
        model_gpu = ct.convert(prog, inputs=input_types, compute_precision=ct.precision.FLOAT32, compute_units=ct.ComputeUnit.CPU_AND_GPU, package_dir=tmpdirname + 'model_gpu.mlpackage')
        inputs = {
            "i": torch.randint(0, shapeB[0], shapeI, dtype=torch.int32),
            "b": torch.rand(shapeB, dtype=torch.float32),
        }
        cpu_output = model_cpu.predict(inputs)
        gpu_output = model_gpu.predict(inputs)
    # equivalent to prog
    expected = inputs["b"][inputs["i"][0]] + inputs["b"][inputs["i"][1]]
    # what actually happens on GPU
    actual = inputs["b"][inputs["i"][0]] + inputs["b"][inputs["i"][0]]
    print(f"diff expected vs cpu: {np.sum(np.absolute(expected - cpu_output['result']))}")
    print(f"diff expected vs gpu: {np.sum(np.absolute(expected - gpu_output['result']))}")
    print(f"diff actual vs gpu: {np.sum(np.absolute(actual - gpu_output['result']))}")
The issue seems to occur in the slice_right + gather operations when executed on the GPU: the wrong items in input "i" are selected. The program outputs:
    diff expected vs cpu: 0.0
    diff expected vs gpu: 150104.015625
    diff actual vs gpu: 0.0
This behavior was observed on a 14-inch MacBook Pro (2023, M2 Pro) running macOS 14.7, using coremltools 8.0b2 with Python 3.9.19.
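To make the reported symptom concrete, here is a tiny pure-Python sketch (no Core ML involved) of the two computations the post compares. The helper names and the toy data are invented for illustration; the "observed" line only mirrors the poster's `actual` expression, where the second gather reuses row 0 of the index input, and is not a confirmed diagnosis of the backend.

```python
def gather_rows(table, indices):
    """Pick rows of `table` by index, like a gather along axis 0."""
    return [table[k] for k in indices]


def add_rows(left, right):
    """Element-wise sum of two equally shaped lists of rows."""
    return [[a + b for a, b in zip(l, r)] for l, r in zip(left, right)]


# Toy stand-ins for the inputs "b" (data) and "i" (two rows of indices).
b = [[0.0, 1.0], [10.0, 11.0], [20.0, 21.0]]
i = [[0, 2], [1, 1]]

# What the program is meant to compute: b[i[0]] + b[i[1]].
expected = add_rows(gather_rows(b, i[0]), gather_rows(b, i[1]))

# What the post reports matching the GPU output: b[i[0]] + b[i[0]].
observed = add_rows(gather_rows(b, i[0]), gather_rows(b, i[0]))

print("expected:", expected)
print("observed:", observed)
```

With random indices the two results differ almost everywhere, which is consistent with the large "expected vs gpu" difference in the post.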
3
0
651
Sep ’24
Can't disable Writing Tools for SwiftUI TextField
I'm trying to disable Writing Tools for a specific TextField using .writingToolsBehavior(.disabled), but when running the app on my iPhone 16 Pro with Apple Intelligence enabled, I can still use Writing Tools on the text box. I also see no difference with .writingToolsBehavior(.limited). Is there something I'm doing wrong or is this a bug? Sample code below:
    import SwiftUI
    struct ContentView: View {
        @State var text = ""
        var body: some View {
            VStack {
                TextField("Enter Text", text: $text)
                    .writingToolsBehavior(.disabled)
            }
            .padding()
        }
    }
    #Preview {
        ContentView()
    }
4
0
1.2k
Sep ’24
Can Writing Tools be accessed In UITableView contextMenu?
I’m currently developing an app that features a main view with a UITableView. When users select a row, they are navigated to a detail view that contains a UITextField. This UITextField already supports Writing Tools. My question is: when a user long-presses a UITableView cell, is it possible to add a Writing Tools option to the context menu, so that users can reach Writing Tools more conveniently, for example to summarize the detail text?
0
0
454
Sep ’24
Apple Music EQ settings
I was just wondering about this, and I'm not sure whether anyone else has thought about it: different sound output devices reproduce sound differently. Could there be an option in the Bluetooth settings that detects when a connected device is a music device and automatically applies a different EQ (as per the user's requirements)? Each music device would then have its own stored EQ setting, recognized via Bluetooth.
1
0
535
Sep ’24
The Vision request does not work in simulator with Error "Could not create inference context"
When I use VNGenerateForegroundInstanceMaskRequest to generate the mask in the simulator from a SwiftUI app, I get the error "Could not create inference context". I then added code to make Vision run on the CPU:
    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(ciImage: inputImage)
    #if targetEnvironment(simulator)
    if #available(iOS 18.0, *) {
        let allDevices = MLComputeDevice.allComputeDevices
        for device in allDevices {
            if(device.description.contains("MLCPUComputeDevice")){
                request.setComputeDevice(.some(device), for: .main)
                break
            }
        }
    } else {
        // Fallback on earlier versions
        request.usesCPUOnly = true
    }
    #endif
    do {
        try handler.perform([request])
        if let result = request.results?.first {
            let mask = try result.generateScaledMaskForImage(forInstances: result.allInstances, from: handler)
            return CIImage(cvPixelBuffer: mask)
        }
    } catch {
        print(error)
    }
Even when I force the simulator to run the request on the CPU, I still get the error "Could not create inference context".
2
1
1k
Sep ’24
Issue with OCR on Swift iOS App: Roboflow API Bounding Boxes Missing After Response
Hi everyone, I'm working on an iOS app built in Swift using Xcode, where I'm integrating Roboflow's object detection API to extract items from grocery receipts. My goal is to identify key information (like items, total, tax, etc.) from the images of these receipts. I'm successfully sending images to the Roboflow API and receiving predictions with bounding box data, but when I attempt to extract text from the detected regions (bounding boxes), it appears that the text extraction is failing: no text is being recognized. The issue seems to be that the bounding boxes are either not being handled properly or something is going wrong in the way I process the API response.
Here's a brief breakdown of what I'm doing:
    1. The image is captured, converted to base64, and sent to the Roboflow API.
    2. The API response comes back with bounding boxes for the detected elements (items, date, subtotal, etc.).
    3. The problem occurs when I try to extract the text from the image using the bounding box data: it seems like the bounding boxes are being found, but no text is returned.
I suspect the issue might be happening because the app’s segue to the results view controller is triggered before the OCR extraction completes, or there might be a problem in my code handling the bounding box response.
Response Data:
    {
      "inference_id": "77134cce-91b5-4600-a59b-fab74350ca06",
      "time": 0.09240847699993537,
      "image": { "width": 370, "height": 502 },
      "predictions": [
        {
          "x": 163.5,
          "y": 250.5,
          "width": 313.0,
          "height": 127.0,
          "confidence": 0.9357666373252869,
          "class": "Item",
          "class_id": 1,
          "detection_id": "753341d5-07b6-42a1-8926-ecbc61128243"
        },
        {
          "x": 52.5,
          "y": 417.5,
          "width": 89.0,
          "height": 23.0,
          "confidence": 0.8819760680198669,
          "class": "Date",
          "class_id": 0,
          "detection_id": "b4681149-d538-47b1-8700-d9528bf1daa0"
        },
        ...
      ]
    }
And the log showing bounding boxes:
    Prediction: ["width": 313, "y": 250.5, "x": 163.5, "detection_id": 753341d5-07b6-42a1-8926-ecbc61128243, "class": Item, "height": 127, "confidence": 0.9357666373252869, "class_id": 1]
    No bounding box found in prediction.
I've double-checked the bounding box coordinates, and everything seems fine. Does anyone have experience with using OCR alongside object detection APIs in Swift? Any help on how to ensure the bounding boxes are properly processed and used for OCR would be greatly appreciated! Also, would it help to delay the segue to the results view controller until OCR is complete? Thank you!
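The log line "No bounding box found in prediction" sits next to a prediction that carries x, y, width and height directly, so one thing worth checking is whether the parsing code looks for a nested box object that the response does not contain. The sketch below reads those four keys from the response shown above. It is written in Python for brevity although the app is Swift, and it rests on two assumptions that the post does not confirm: that x and y are the box center in pixels, and that the OCR stage wants a normalized rectangle with a bottom-left origin (as Vision's regionOfInterest does). The helper names are invented.

```python
import json


def center_box_to_corners(pred):
    """Convert a center-based box to (left, top, right, bottom) in pixels."""
    half_w = pred["width"] / 2
    half_h = pred["height"] / 2
    return (pred["x"] - half_w, pred["y"] - half_h,
            pred["x"] + half_w, pred["y"] + half_h)


def to_normalized_bottom_left(box, image_width, image_height):
    """Scale pixel corners to a 0..1 (x, y, w, h) rect with a bottom-left origin."""
    left, top, right, bottom = box
    return (left / image_width,
            (image_height - bottom) / image_height,  # flip the vertical axis
            (right - left) / image_width,
            (bottom - top) / image_height)


# First prediction from the response in the post, trimmed to the keys used here.
response = json.loads("""
{
  "image": {"width": 370, "height": 502},
  "predictions": [
    {"x": 163.5, "y": 250.5, "width": 313.0, "height": 127.0, "class": "Item"}
  ]
}
""")

image = response["image"]
for pred in response["predictions"]:
    corners = center_box_to_corners(pred)
    region = to_normalized_bottom_left(corners, image["width"], image["height"])
    print(pred["class"], corners, region)
```

On the segue question, running text recognition for every region and presenting the results screen only from the completion of the last one would rule out the timing explanation.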
0
0
582
Sep ’24
Issue with Optimizing Stable Diffusion XL Model for iOS 18
Hi everyone, I’m currently converting and optimizing the Stable Diffusion XL model for iOS 18. I followed the steps from the WWDC 2024 session on model optimization, specifically the one titled "Bring your machine learning and AI models to Apple Silicon." I used the Stable Diffusion XL model and the tools available in the ml-stable-diffusion GitHub repository and ran the following script to convert the model into an .mlpackage:
    python3 -m python_coreml_stable_diffusion.torch2coreml \
        --convert-unet \
        --convert-vae-decoder \
        --convert-text-encoder \
        --xl-version \
        --model-version stabilityai/stable-diffusion-xl-base-1.0 \
        --bundle-resources-for-swift-cli \
        --refiner-version stabilityai/stable-diffusion-xl-refiner-1.0 \
        --attention-implementation SPLIT_EINSUM \
        -o ../PotraitModel/ \
        --custom-vae-version madebyollin/sdxl-vae-fp16-fix \
        --latent-h 128 \
        --latent-w 96 \
        --chunk-unet
The model conversion worked without any issues. However, when I proceeded to optimize the model in a Jupyter notebook, following the same process shown in the WWDC session, I encountered an error during the post-training quantization step. Here’s the code I used for that:
    op_config = cto_coreml.OpPalettizerConfig(
        nbits=4,
        mode="kmeans",
        granularity="per_grouped_channel",
        group_size=16,
    )
    config = cto_coreml.OptimizationConfig(op_config)
    compressed_model = cto_coreml.palettize_weights(mlmodel, config)
Unfortunately, I received the following error:
    AssertionError: The IOS16 only supports per-tensor LUT, but got more than one lut on 0th axis. LUT shape: (80, 1, 1, 1, 16, 1)
It appears that the minimum deployment target of the MLModel is set to iOS 16, which might be causing compatibility issues. How can I update the minimum deployment target to iOS 18? If anyone has encountered this issue or knows a workaround, I would greatly appreciate your guidance! Thanks in advance for any help!
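The LUT shape in the error can be read against the config above. The sketch below is plain arithmetic, not coremltools code, and the layout it assumes is inferred from the error message rather than taken from documentation: one leading dimension per weight axis holding the number of groups along that axis, followed by 2**nbits palette entries and a vector size of 1. The weight shape (1280, 640, 3, 3) is a hypothetical example, chosen only because 1280 / 16 = 80.

```python
def lut_shape(weight_shape, nbits, group_size, axis=0, vector_size=1):
    """Sketch of the LUT shape for grouped-channel palettization of one weight."""
    channels = weight_shape[axis]
    if channels % group_size != 0:
        raise ValueError("channel count must be divisible by group_size")
    groups = [1] * len(weight_shape)
    groups[axis] = channels // group_size  # one lookup table per channel group
    return tuple(groups) + (2 ** nbits, vector_size)


def is_per_tensor(shape):
    """A per-tensor LUT has a single table, so every leading dimension is 1."""
    return all(n == 1 for n in shape[:-2])


grouped = lut_shape((1280, 640, 3, 3), nbits=4, group_size=16)
print(grouped, "per-tensor:", is_per_tensor(grouped))
```

Read this way, group_size=16 produces 80 tables along axis 0, which the iOS 16 target named in the message rejects. In coremltools the deployment target is chosen at conversion time through the `minimum_deployment_target` argument of `ct.convert`, so the model would likely need to be converted again with a newer target before this palettization config applies.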
2
0
979
Sep ’24
Genmoji developer support
I'm trying to experiment with Genmoji per the WWDC documentation and samples, but I don't seem to get the Genmoji keyboard. I see this error in my log:
    Received port for identifier response: <(null)> with error:Error Domain=RBSServiceErrorDomain Code=1 "Client not entitled" UserInfo={RBSEntitlement=com.apple.runningboard.process-state, NSLocalizedFailureReason=Client not entitled, RBSPermanent=false}
    elapsedCPUTimeForFrontBoard couldn't generate a task port
Is anything presently supported for developers? All I have done here is a simple app with a UITextView and this code:
    textView.supportsAdaptiveImageGlyph = true
Any thoughts?
0
0
606
Sep ’24