Explore the power of machine learning within apps. Discuss integrating machine learning features, share best practices, and explore the possibilities for your app.

Posts under General subtopic

Post

Replies

Boosts

Views

Activity

RecognizeDocumentsRequest for receipts
Hi, I'm trying to use the new RecognizeDocumentsRequest from the Vision Framework to read a receipt. It looks very promising by being able to read paragraphs, lines and detect data. So far it unfortunately seems to read every line on the receipt as a paragraph and when there is more space on one line it creates two paragraphs. Is there perhaps an Apple Engineer who knows if this is expected behaviour or if I should file a Feedback for this? Code setup: let request = RecognizeDocumentsRequest() let observations = try await request.perform(on: image) guard let document = observations.first?.document else { return } for paragraph in document.paragraphs { print(paragraph.transcript) for data in paragraph.detectedData { switch data.match.details { case .phoneNumber(let data): print("Phone: \(data)") case .postalAddress(let data): print("Postal: \(data)") case .calendarEvent(let data): print("Calendar: \(data)") case .moneyAmount(let data): print("Money: \(data)") case .measurement(let data): print("Measurement: \(data)") default: continue } } } See attached image as an example of a receipt I'd like to parse. The top 3 lines are the name, street, and postal code + city. These are all separate paragraphs. Checking on detectedData does see the street (2nd line) as PostalAddress, but not the complete address. Might that be a location thing since it's a Dutch address. And lower on the receipt it sees the block with "Pomp 1 95 Ongelood" and the things below also as separate paragraphs. First picking up the left side and after that the right side. So it's something like this: * Pomp 1 Volume Prijs € TOTAAL * BTW Netto 21.00 % 95 Ongelood 41,90 l 1.949/ 1 81.66 € 14.17 67.49
2
1
278
2w
Vision framework OCR missing Swedish support?
WWDC 2024 mentioned that the OCR feature from the Vision framework has support for "Korean, Swedish, and Chinese", but the Swedish support does not seem to be available... Running either print(try? VNRecognizeTextRequest().supportedRecognitionLanguages()) or var ocrRequest = RecognizeTextRequest(.revision3) print(ocrRequest.supportedRecognitionLanguages) did not print out Swedish as one of the supported languages, but Korean and Chinese are. Tested on early versions of iOS 18 developer beta, and the latest version of iOS 18.1 (22B5054e).
1
0
656
Oct ’24
Integer arithmetic with Accelerate
Almost all the functions in Accelerate are for single precision (Float) and double precision (Double) operations. However, I stumbled upon three integer arithmetic functions which operate on Int32 values. Are there any more functions in Accelerate that operate on integer values? If not, then why aren't there more functions that work with integers?
1
0
529
Oct ’24
Issues with Statsmodels
When I import starts models in Jupyter notebook, I ge the following error: ImportError: dlopen(/opt/anaconda3/lib/python3.12/site-packages/scipy/linalg/_fblas.cpython-312-darwin.so, 0x0002): Library not loaded: @rpath/liblapack.3.dylib Referenced from: <5ACBAA79-2387-3BEF-9F8E-6B7584B0F5AD> /opt/anaconda3/lib/python3.12/site-packages/scipy/linalg/_fblas.cpython-312-darwin.so Reason: tried: '/opt/anaconda3/lib/python3.12/site-packages/scipy/linalg/../../../../liblapack.3.dylib' (no such file), '/opt/anaconda3/lib/python3.12/site-packages/scipy/linalg/../../../../liblapack.3.dylib' (no such file), '/opt/anaconda3/bin/../lib/liblapack.3.dylib' (no such file), '/opt/anaconda3/bin/../lib/liblapack.3.dylib' (no such file), '/usr/local/lib/liblapack.3.dylib' (no such file), '/usr/lib/liblapack.3.dylib' (no such file, not in dyld cache). What should I do?
1
0
687
Oct ’24
Help with TensorFlow to CoreML Conversion: AttributeError: 'float' object has no attribute 'astype'
Hello, I’m attempting to convert a TensorFlow model to CoreML using the coremltools package, but I’m encountering an error during the conversion process. The error traceback points to an issue within the Cast operation in the MIL (Model Intermediate Layer) when it tries to perform type inference: AttributeError: 'float' object has no attribute 'astype' Here is the relevant part of the error traceback: File ~/.pyenv/versions/3.10.12/lib/python3.10/site-packages/coremltools/converters/mil/mil/ops/defs/iOS15/elementwise_unary.py", line 896, in get_cast_value return input_var.val.astype(dtype=type_map[dtype_val]) I’ve tried converting a model from the yamnet-tensorflow2 repository, and this error occurs when CoreML tries to cast a float type during the conversion of certain operations. I’m currently using Python 3.10 and coremltools version 6.0.1, with TensorFlow 2.x. Has anyone encountered a similar issue or can offer suggestions on how to resolve this? I’ve also considered that this might be related to mismatches in the model’s data types, but I’m not sure how to proceed. Platform and package versions: coremltools 6.1 tensorflow 2.10.0 tensorflow-estimator 2.10.0 tensorflow-hub 0.16.1 tensorflow-io-gcs-filesystem 0.37.1 Python 3.10.12 pip 24.3.1 from ~/.pyenv/versions/3.10.12/lib/python3.10/site-packages/pip (python 3.10) Darwin MacBook-Pro.local 24.1.0 Darwin Kernel Version 24.1.0: Thu Oct 10 21:02:27 PDT 2024; root:xnu-11215.41.3~2/RELEASE_X86_64 x86_64 Any help or pointers would be greatly appreciated!
2
0
1.1k
Nov ’24
About VisionKit DataScannerViewController
Hi I'm having a problem with DataScannerViewController, I'm using the volume barcode scanning feature in my app, prior to that I was using an AVCaptureDevice with the UltraWideAngle set. After discovering DataScannerViewController, we planned to replace the previous obsolete code with DataScannerViewController, all together it was ok, when I want to set the ultra wide angle, I don't know how to start. I tried to get the minZoomFactor and I realized that I get 0.0 I tried to set zoomFactor to 1.0 and I found that he is not valid Note: func dataScannerDidZoom(_ dataScanner: DataScannerViewController), when I try to get the minZoomFactor, set the zoomFactor in this proxy method, I find that it is valid! What should I do next, I want to use only DataScannerViewController and implement ultra wide angle Thanks a lot.
1
0
616
Jan ’25
BarcodeObservation Orientation
Hi, I'm working with vision framework to detect barcodes. I tested both ean13 and data matrix detection and both are working fine except for the QuadrilateralProviding values in the returned BarcodeObservation. TopLeft, topRight, bottomRight and bottomLeft coordinates are rotated 90° counter clockwise (physical bottom left of data Matrix, the corner of the "L" is returned as the topLeft point in observation). The same behaviour is happening with EAN13 Barcode. Did someone else experienced the same issue with orientation? Is it normal behaviour or should we expect a fix in next releases of the Vision Framework?
4
0
563
Jan ’25
How to use a Decimal as @Property of AppEntity
I’m trying to use a Decimal as a @Property in my AppEntity, but using the following code shows me a compiler error. I’m using Xcode 16.1. The documentation notes the following: You can use the @Parameter property wrapper with common Swift and Foundation types: Primitives such as Bool, Int, Double, String, Duration, Date, Decimal, Measurement, and URL. Collections such as Array and Set. Make sure the collection’s elements are of a type that’s compatible with IntentParameter. Everything works fine for other primitives as bools, strings and integers. How do I use the Decimal though? Code struct MyEntity: AppEntity { var id: UUID @Property(title: "Amount") var amount: Decimal // … } Compiler Error This error appears at the line of the @Property definition: Generic class 'EntityProperty' requires that 'Decimal' conform to '_IntentValue'
1
0
627
Dec ’24
Easy way to implement facial recognition
Hi everyone😊, I want to implement facial recognition into my app. I was planning to use createML's image classification, but there seams to be a lot of hassle to implement (the JSON file etc.). Are there some other easy to implement options that don't involve advanced coding. Thanks, Oliver
2
0
557
Feb ’25
Xcode AI Coding Assistance Option(s)
Not finding a lot on the Swift Assist technology announced at WWDC 2024. Does anyone know the latest status? Also, currently I use OpenAI's macOS app and its 'Work With...' functionality to assist with Xcode development, and this is okay, certainly saves copying code back and forth, but it seems like AI should be able to do a lot more to help with Xcode app development. I guess I'm looking at what people are doing with AI in Visual Studio, Cline, Cursor and other IDEs and tools like those and feel a bit left out working in Xcode. Please let me know if there are AI tools or techniques out there you use to help with your Xcode projects. Thanks in advance!
6
0
11k
Mar ’25
What special features does Apple officially have that use ML or AI?
I am a App designer and I am curious about what specific ML or AI Apple used to develop those features in the system. As far as I know, Apple's hand-raising detection, destination recommendations in maps, and exercise types in fitness all use ML. Are there more specific application examples of ML or AI? Does Apple have a document specifically introducing examples of specific applications of ML or AI technology in the system?
1
0
580
Feb ’25
Metal GPU Work Won't Stop
Is there any way to stop GPU work running that is scheduled using metal? Long shader calculations don't stop when application is stopped in Xcode and continue to take up GPU time and affect the display. Why is this functionality not available when Swift Tasks are able to be canceled?
2
0
724
Feb ’25
MPSGraph fused scaledDotProductAttention seems to be buggy
While building an app with large language model inferencing on device, I got gibberish output. After carefully examining every detail, I found it's caused by the fused scaledDotProductAttention operation. I switched back to the discrete operations and problem solved. To reproduce the bug, please check https://github.com/zhoudan111/MPSGraph_SDPA_bug
1
0
499
Mar ’25
Selecting GPU for TensorFlow-Metal on Mac Pro (2013) with v0.8.0
Hi everyone, I'm a Mac enthusiast experimenting with tensorflow-metal on my Mac Pro (2013). My question is about GPU selection in tensorflow-metal (v0.8.0), which still supports Intel-based Macs, including my machine. I've noticed that when running TensorFlow with Metal, it automatically selects a GPU, regardless of what I specify using device indices like "gpu:0", "gpu:1", or "gpu:2". I'm wondering if there's a way to manually specify which GPU should be used via an environment variable or another method. For reference, I’ve tried the example from TensorFlow’s guide on multi-GPU selection: https://www.tensorflow.org/guide/gpu#using_a_single_gpu_on_a_multi-gpu_system My goal is to explore performance optimizations by using MirroredStrategy in TensorFlow to leverage multiple GPUs: https://www.tensorflow.org/guide/distributed_training#mirroredstrategy Interestingly, I discovered that the metalcompute Python library (https://pypi.org/project/metalcompute/) allows to utilize manually selected GPUs on my system, allowing for proper multi-GPU computations. This makes me wonder: Is there a hidden environment variable or setting that allows manual GPU selection in tensorflow-metal? Has anyone successfully used MirroredStrategy on multiple GPUs with tensorflow-metal? Would a bridge between metalcompute and tensorflow-metal be necessary for this use case, or is there a more direct approach? I’d love to hear if anyone else has experimented with this or has insights on getting finer control over GPU selection. Any thoughts or suggestions would be greatly appreciated! Thanks!
3
0
189
Mar ’25
VNRecognizeTextRequest: .automatic vs specific language: different results?
Hi, One can configure the languages of a (VN)RecognizeTextRequest with either: .automatic: language to be detected a specific language, say Spanish If the request is configured with .automatic and successfully detects Spanish, will the results be exactly equivalent compared to a request made with Spanish set as language? I could not find any information about this, and this is very important for the core architecture of my app. Thanks!
2
0
75
Apr ’25