VisionKit


Scan documents with the camera on iPhone and iPad devices using VisionKit.

VisionKit Documentation

Posts under VisionKit tag

29 Posts
Post not yet marked as solved
0 Replies
151 Views
Hi, I have followed the instructions, but I do not have the option to scan a document in Notes. Any suggestions on what I could do? This was a feature I was looking forward to. :-(
Post not yet marked as solved
0 Replies
178 Views
Can anyone please suggest whether they have attempted to keep the camera on and dynamically overlay text depending on what is identified in the view? E.g., point the camera at a fruit, and it should identify the fruit and display text over the camera feed. The moment I move to the next object, it should ask me whether I want to save this or discard it and move on to the new object.
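A rough sketch of the overlay idea using Vision's built-in image classifier is below. All names here are illustrative, and the capture-session and preview-layer wiring is omitted; this is a starting point, not a definitive implementation:

```swift
import AVFoundation
import UIKit
import Vision

final class ObjectOverlayController: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let overlayLabel = UILabel()   // assumed to be added above the camera preview elsewhere

    // Called for each camera frame: classify it and update the overlay text.
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let request = VNClassifyImageRequest()
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .right)
        try? handler.perform([request])
        // Take the top classification, if it is confident enough.
        if let best = request.results?.first, best.confidence > 0.5 {
            DispatchQueue.main.async {
                self.overlayLabel.text = best.identifier   // e.g. "banana"
            }
        }
    }
}
```

The save/discard prompt on object change could then be driven by comparing the current top identifier against the previous frame's identifier.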
Post marked as solved
1 Reply
208 Views
I'm using Vision OCR (with VNRecognizeTextRequest) to detect text in images. For our specific use case, we need to know the position of each letter, which we can get with recognizedText.boundingBox(for: (idx1..<idx2)) (where idx2 = idx1 + 1). However, this result is only valid when the recognition level of the request is set to .fast: when it is set to .accurate, the bounding box for a letter is not the bounding box of the letter itself, but of the whole word containing it. This is essentially the same problem described here: https://developer.apple.com/forums/thread/131510
The issue is that we cannot use the .fast recognition level: the text may be tilted, and the letters are often hard to read with poor contrast, which produces unusable results with the .fast setting. Does anyone know:
whether there is a way to extract the bounding box of individual letters from a VNRecognizedTextObservation with the .accurate setting?
whether an update or fix is planned for this issue, or whether the Vision team is not addressing it?
whether there is a way to request a bug fix on this issue from the dev team?
We really need this feature, so any info is welcome. Thanks in advance for your answers.
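For anyone reproducing the comparison described above, here is a minimal sketch of the per-character lookup. The input image and error handling are placeholders; the behavior difference between the two recognition levels is the point being illustrated:

```swift
import Vision

// Request character-level boxes; per the thread above, these are only
// reliable with the .fast recognition level.
let request = VNRecognizeTextRequest { request, _ in
    guard let observations = request.results as? [VNRecognizedTextObservation] else { return }
    for observation in observations {
        guard let candidate = observation.topCandidates(1).first else { continue }
        let text = candidate.string
        for idx in text.indices {
            let next = text.index(after: idx)
            // boundingBox(for:) returns a VNRectangleObservation in
            // normalized image coordinates.
            if let box = try? candidate.boundingBox(for: idx..<next) {
                print("\(text[idx]): \(box.boundingBox)")
            }
        }
    }
}
request.recognitionLevel = .fast   // with .accurate, boxes collapse to whole words

let handler = VNImageRequestHandler(cgImage: inputImage)  // inputImage: your CGImage
try handler.perform([request])
```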
Post marked as solved
3 Replies
357 Views
Hello, I am trying to play around with the Live Text API according to these docs: https://developer.apple.com/documentation/visionkit/enabling_live_text_interactions_with_images?changes=latest_minor But it always logs [api] -[CIImage initWithCVPixelBuffer:options:] failed because the buffer is nil. I am running this on a UIImage instance that I got from VNDocumentCameraViewController. This is my current implementation, which I run after the scanned image is displayed:

```swift
private func setupLiveText() {
    guard let image = imageView.image else { return }
    let interaction = ImageAnalysisInteraction()
    imageView.addInteraction(interaction)
    Task {
        let configuration = ImageAnalyzer.Configuration([.text])
        let analyzer = ImageAnalyzer()
        do {
            let analysis = try await analyzer.analyze(image, configuration: configuration)
            DispatchQueue.main.async {
                interaction.analysis = analysis
            }
        } catch {
            print(error.localizedDescription)
        }
    }
}
```

The analyze call does not throw and returns a non-nil analysis object, but setting it on the interaction does nothing. I am testing this on an iPhone SE 2020, which has the A13 chip; this feature requires A12 and up.
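One detail worth double-checking here (an assumption on my part, not confirmed in the post): an ImageAnalysisInteraction does not surface anything until its preferredInteractionTypes is set, so the setup might need a line like this:

```swift
let interaction = ImageAnalysisInteraction()
// Without this, assigning an analysis has no visible effect.
interaction.preferredInteractionTypes = .automatic
imageView.addInteraction(interaction)
```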
Post marked as solved
2 Replies
327 Views
Hi, the presentation "Capture Machine Readable Codes and Text with VisionKit" mentions at the end that the DataScannerViewController can be used with an async stream. The presentation includes a code snippet for the updateViewAsyncStream method, but it's not actually used anywhere. How do I utilize this while the DataScannerViewController is active to capture the recognized items? Also, there is a sendDidChangeNotification() function at the end, but the compiler complains that it's not in scope. Thanks.
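Based on DataScannerViewController's documented recognizedItems property (an AsyncStream of [RecognizedItem]), a sketch of consuming the stream while the scanner is presented might look like this; the function name is a placeholder, not part of the VisionKit API:

```swift
import VisionKit

// Observe the scanner's async stream of recognized items while it is running.
func observeItems(from scanner: DataScannerViewController) {
    Task {
        for await items in scanner.recognizedItems {
            for item in items {
                switch item {
                case .text(let text):
                    print("Text: \(text.transcript)")
                case .barcode(let barcode):
                    print("Barcode: \(barcode.payloadStringValue ?? "unknown")")
                @unknown default:
                    break
                }
            }
        }
    }
}
```

Each element of the stream is the full current set of recognized items, so diffing against the previous set is one way to react only to additions.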
Post not yet marked as solved
1 Reply
108 Views
I work on an iOS app that displays images that often contain text, and I'm adding support for ImageAnalysisInteraction as described in this WWDC 2022 session. I have gotten as far as making the interaction show up and being able to select text and get the system selection menu, and even add my own action to the menu via the buildMenuWithBuilder API. But what I really want to do with my custom action is get the selected text and do a custom lookup-like thing to check the text against other content in my app. So how do I get the selected text from an ImageAnalysisInteraction on a UIImageView? The docs show methods to check if there is selected text, but I want to know what the text is.
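If the app can target iOS 17 or later (an assumption; this API is newer than the WWDC 2022 session mentioned above), ImageAnalysisInteraction exposes the selection directly, along the lines of this sketch (lookUpInMyApp is a hypothetical custom handler):

```swift
// iOS 17+: read the current text selection straight from the interaction.
if interaction.hasActiveTextSelection {
    let text = interaction.selectedText
    lookUpInMyApp(text)   // hypothetical: your app's custom lookup
}
```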
Post not yet marked as solved
1 Reply
23 Views
I am reading image text with Vision's OCR capabilities and trying to find the title of the document. This is straightforward when the title is at the top of the document, but in some cases, for example when reading a business card, the title can appear somewhere in the middle. While debugging, I found that there is an isTitle field (screenshot attached) on VNRecognizedTextObservation, but I am not able to access it. Is this private? I don't see a clear reason for this property to be private.
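Since that field is not part of the public API, one workaround is a heuristic over the public observations, e.g. treating the tallest recognized line as the likely title. A minimal sketch, under that assumption:

```swift
import Vision

// isTitle is not public API; as a heuristic, pick the text line whose
// normalized bounding box is tallest (titles are usually set larger).
func guessTitle(from observations: [VNRecognizedTextObservation]) -> String? {
    observations
        .max { $0.boundingBox.height < $1.boundingBox.height }?
        .topCandidates(1).first?.string
}
```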