VisionKit

Scan documents with the camera on iPhone and iPad devices using VisionKit.

VisionKit Documentation

Post

Replies

Boosts

Views

Activity

ImageAnalysisInteraction doesn't call contentsRect delegate's method

Hello, I am struggling with an issue where the contentsRect(for:) method of ImageAnalysisInteractionDelegate is never called. I've set up a demo project in which the interaction is added to the view controller's root view, while the image I'm analyzing comes from a UIImageView that is a subview of that view. I want to be able to define the contents rect for the highlights of the text found in that image.

P.S. I know that I could simply add the interaction to the image view, but that's not the point: in the real project I want to display Live Text over a paused video player, so the image view here is for simplicity only.

```swift
import UIKit
import VisionKit

class ViewController: UIViewController {

    private let imageView = UIImageView()
    private let imageAnalyzer = ImageAnalyzer()
    private let interaction = ImageAnalysisInteraction()

    override func viewDidLoad() {
        super.viewDidLoad()

        view.addSubview(imageView)
        imageView.translatesAutoresizingMaskIntoConstraints = false
        NSLayoutConstraint.activate([
            imageView.leadingAnchor.constraint(equalTo: view.leadingAnchor),
            imageView.trailingAnchor.constraint(equalTo: view.trailingAnchor),
            imageView.centerYAnchor.constraint(equalTo: view.centerYAnchor),
            imageView.heightAnchor.constraint(equalTo: view.heightAnchor, multiplier: 0.7)
        ])

        interaction.delegate = self

        // Some image with text that I have in assets
        imageView.image = UIImage(named: "IMG_5564")
        imageView.contentMode = .scaleAspectFit

        // The interaction is attached to the root view, not to the image view
        view.addInteraction(interaction)
        interaction.setContentsRectNeedsUpdate()

        DispatchQueue.main.asyncAfter(deadline: .now() + 1.0) {
            self.analyze()
        }
    }

    private func analyze() {
        Task {
            let imageAnalysis = try? await imageAnalyzer.analyze(
                self.imageView.image!,
                configuration: .init([.machineReadableCode, .text])
            )
            self.interaction.analysis = imageAnalysis
            self.interaction.preferredInteractionTypes = .automatic
            self.interaction.setContentsRectNeedsUpdate()
        }
    }
}

extension ViewController: ImageAnalysisInteractionDelegate {

    func presentingViewController(for interaction: ImageAnalysisInteraction) -> UIViewController? {
        return nil
    }

    func contentsRect(for interaction: ImageAnalysisInteraction) -> CGRect {
        // >>> This method is never being called <<<
        return CGRect(x: 0, y: 0, width: 1.0, height: 0.7)
    }

    func contentView(for interaction: ImageAnalysisInteraction) -> UIView? {
        return nil
    }

    func interaction(_ interaction: ImageAnalysisInteraction, highlightSelectedItemsDidChange highlightSelectedItems: Bool) {
        debugPrint("highlight: \(highlightSelectedItems)")
    }
}
```
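For reference, here is a minimal sketch of how a contents rect for this kind of layout could be computed. It assumes, as I understand the documentation, that contentsRect(for:) expects a rectangle in unit coordinate space (0...1 relative to the interaction's view). The helper name `unitContentsRect` and its parameters are hypothetical, and the sketch does not address why the delegate method is not being called.

```swift
import Foundation

/// Hypothetical helper: returns the rectangle, in unit coordinates of an outer
/// view of size `outerSize`, occupied by an image of `imageSize` drawn with
/// aspect-fit scaling and centered inside `containerFrame`.
func unitContentsRect(imageSize: CGSize, containerFrame: CGRect, outerSize: CGSize) -> CGRect {
    // Fall back to the full unit rect when any dimension is degenerate.
    guard imageSize.width > 0, imageSize.height > 0,
          containerFrame.size.width > 0, containerFrame.size.height > 0,
          outerSize.width > 0, outerSize.height > 0 else {
        return CGRect(x: 0, y: 0, width: 1, height: 1)
    }

    // Aspect fit: the smaller of the two ratios keeps the whole image visible.
    let scale = min(containerFrame.size.width / imageSize.width,
                    containerFrame.size.height / imageSize.height)
    let fittedWidth = imageSize.width * scale
    let fittedHeight = imageSize.height * scale

    // Center the fitted image inside the container (top-left origin).
    let x = containerFrame.origin.x + (containerFrame.size.width - fittedWidth) / 2
    let y = containerFrame.origin.y + (containerFrame.size.height - fittedHeight) / 2

    // Normalize against the outer view's size.
    return CGRect(x: x / outerSize.width,
                  y: y / outerSize.height,
                  width: fittedWidth / outerSize.width,
                  height: fittedHeight / outerSize.height)
}
```

In the layout above, `containerFrame` would be the image view's frame in the root view and `outerSize` the root view's bounds size.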

App & System Services General VisionKit WWDC22 WWDC22 Support Live Text

1.2k

Aug ’22

How to get the filters in VNDocumentCameraViewController without opening the camera?

There are 4 filters in VNDocumentCameraViewController: "Color", "Grayscale", "Black & White", and "Photo". Is there a way to apply these filters to UIImages directly, without opening the camera?

Media Technologies General Image I/O VisionKit Photos and Imaging Core Image

1.2k

Aug ’22

DataScannerViewController Scan Text is not working

Hi, I'm using Xcode 14.0 beta 4 and iOS 16.0 beta 2. I followed the tutorial on but when I try to scan text, it does not show up in the CameraScanner. I think the dataScanner func is never called; instead, I keep getting this warning message: Custom words array can only contain strings. Ignoring custome words array. So I can see the camera and the highlight anchor, but I cannot extract the text that VisionKit detected. Is there any workaround? Thanks in advance.

Community Apple Developers Beta VisionKit

1.1k

Aug ’22

DataScannerViewController in Objective-C

Hi, is DataScannerViewController available to be called directly from Objective-C? I see the header file has an "objc" attribute on it, but trying to initialize it from an Objective-C file doesn't seem to work for me. Maybe I'm doing something wrong, but I first wanted to confirm whether it is indeed possible to use it directly in Objective-C or not.

Machine Learning & AI General VisionKit wwdc2022-10024 wwdc2022-10025

1.1k

Aug ’22

Getting title of document via VNRecognizedTextObservation

I am reading text from images using the Vision OCR capabilities and trying to find the title of the document. This is fairly straightforward when the title is at the top of the document. But in some cases, for example when reading a business card, the title appears somewhere in the middle of the card. While debugging, I found that there is an isTitle field (screenshot attached) on VNRecognizedTextObservation, but I am not able to access it. Is it private? I don't see a clear reason for this property to be private.

Machine Learning & AI General Vision VisionKit

937

Jul ’22

Using DataScannerViewController with async stream

Hi, the presentation "Capture Machine Readable Codes and Text with VisionKit" mentions at the end that DataScannerViewController can be used with an async stream. The presentation shows a code snippet for the updateViewAsyncStream method, but it's not really used anywhere. How do I use this while the DataScannerViewController is active to capture the recognized items? Also, there is a sendDidChangeNotification() function at the end, but the compiler complains that it's not in scope. Thanks.

Machine Learning & AI General VisionKit wwdc2022-10025

2.0k

Jul ’22

Use the camera for keyboard input in your app

Please give some example code for scanning text using a button.

Media Technologies Photos & Camera VisionKit Camera AVFoundation

1.1k

Jun ’22

[VisionKit Text Recognition] boundingBox(for:) returns wrong results when used with .accurate recognition level

I'm using the Vision OCR (with VNRecognizeTextRequest) to detect text in images. For our specific use case, we need to know the position of each letter, and we can get it with recognizedText.boundingBox(for: (idx1..<idx2)) (where idx2 = idx1 + 1).

However, this result is only valid when the recognition level of the request is set to .fast. When it is set to .accurate, the bounding box for any letter is not the bounding box of the letter itself, but the bounding box of the whole word containing the letter. Basically, this is the same problem as the one described here: https://developer.apple.com/forums/thread/131510

The issue is that we cannot use the .fast recognition level: the text might be tilted, and the letters are often hard to read with pretty bad contrast, which produces unusable results with the .fast setting.

Does anyone know (1) whether there is a way to directly extract the bounding boxes of the letters from the VNRecognizedTextObservation with the .accurate setting, and (2) whether an update or feature adjustment is planned for this issue, or whether the Vision dev team doesn't care about it? Is there even a way to request a bug fix for this from the dev team?

We really do need this feature, so any info is good info. Thanks in advance for your answers.

Machine Learning & AI General VisionKit

1.2k

May ’22

Dynamic Text overlay on live camera feed

Has anyone attempted to keep the camera on and overlay dynamic text depending on what is identified in the view? E.g., when I point the camera at a fruit, the app should identify the fruit and display text over the camera feed. The moment I move to the next object, it should ask me whether I want to save the current one or discard it and move on to the new object.

UI Frameworks SwiftUI VisionKit SwiftUI Camera

720

Apr ’22

No option to scan in notes

Hi, I have followed the instructions but I do not have the option to scan a document in Notes. Any suggestions on what I could do? This was a feature I was looking forward to :-(

Machine Learning & AI General VisionKit

500

Apr ’22

Finding API on "scanning the document"

I'm looking for the API for "scanning the document" that is used in the Notes app. I want to build a feature that scans a document automatically, so I need it. Please tell me where to find it.

Programming Languages Swift Swift Playground Swift VisionKit

746

Mar ’22

Order of points in VNFaceLandmarkRegion2D

In my app, I am performing a VNDetectFaceLandmarksRequest with a VNSequenceRequestHandler. The video that serves as my input comes from my iPhone's selfie camera. The request returns a VNFaceLandmarkRegion2D, from which I get all the landmarks as an array of CGPoints via VNFaceLandmarkRegion2D.normalizedPoints. I want to compare these CGPoint arrays over time, but I am not sure whether a point at a certain index always represents the same landmark. Can I assume that a specific landmark, e.g. the left-most landmark of the right eye, always has the same index in the CGPoint array?

App & System Services Core OS iOS VisionKit Vision

910

Mar ’22

dynamic reading of text in live camera feed

I want to read text and dynamically display what the word means on the live camera feed. Any ideas? It should be able to read at most 2 words on screen, but the user should be able to tap to select the word/image and save the screen too.

UI Frameworks SwiftUI VisionKit SwiftUI AVFoundation

849

Mar ’22

Cannot find type "..." in scope

I am trying to make a QR code scanner. However, it keeps saying that the scanner is not in scope. How can I fix this?

Code Signing Notarization VisionKit Notarization

6.7k

Feb ’22

Bad quality of scanned documents

Hey guys, I'm facing the issue that documents scanned on my iPhone 12 Pro Max with the Files app are of pretty bad quality. I guess it started with iOS 15 beta 3. Unfortunately, the issue still persists with the current non-beta iOS 15 release. It's the same on iPadOS 15. When I launch "scan with iPhone" using the Preview app on macOS, the quality is as good as always. Hence it looks like the issue is related to the Files app or PDF processing on iPhone. Has anybody else seen the same? Thanx and cheers, Flory

App & System Services Networking PDFKit VisionKit Camera

15k

Jan ’22

I am a new developer. How can i make my output text editable?

I have made a Scan to Text app with the help of sources from the internet, but I can't figure out a way to make my output text editable. Here's my code:

```swift
private func makeScannerView() -> ScannerView {
    ScannerView(completion: { textPerPage in
        // Join the recognized text of all pages and trim surrounding whitespace
        if let outputText = textPerPage?
            .joined(separator: "\n")
            .trimmingCharacters(in: .whitespacesAndNewlines) {
            let newScanData = ScanData(content: outputText)
            self.texts.append(newScanData)
        }
        self.showScannerSheet = false
    })
}
```

App & System Services General VisionKit TextKit Vision Core Text

752

Jan ’22

Vision : VNDetectRectanglesRequest Error in iOS15

The VNDetectorOption_OriginatingRequestSpecifier required option was not found" UserInfo={NSLocalizedDescription=The VNDetectorOption_OriginatingRequestSpecifier required option was not found

I'm facing this error only on iOS 15 while retrieving observations.

Machine Learning & AI General VisionKit Vision

1.4k

Jan ’22

How to get the saliency mask of the VNDetectDocumentSegmentationRequest

In one of the WWDC videos, the VNDetectDocumentSegmentationRequest result is described in the following way:

The result of the request is a low resolution segmentation mask, where each pixel represents a confidence if that pixel is part of the detected document or not. In addition it provides the four corner points of the quadrilateral.

Similarly, in the VNDetectDocumentSegmentationRequest docs there's the following statement:

The result that the request generates contains the four corner points of a document's quadrilateral and saliency mask.

So the first part ("four corner points of a document's quadrilateral") is easy: it's in the results of the request, which are in VNRectangleObservation format:

```swift
let request = VNDetectDocumentSegmentationRequest { (request, error) in
    guard let results = request.results as? [VNRectangleObservation] else {
        // Failed
        return
    }
    // Process VNRectangleObservations
}
```

But how do I obtain the "low resolution segmentation mask" / "saliency mask" for VNDetectDocumentSegmentationRequest?

Machine Learning & AI General VisionKit wwdc21-10041

1.1k

Dec ’21

VNRecognizedText confidence values are only 0.5 and 1.0

I'm using Vision to perform OCR on a live camera feed. I've set up my VNRecognizeTextRequests as follows:

```swift
let request = VNRecognizeTextRequest(completionHandler: recognizeTextCompletionHandler)
request.recognitionLevel = .accurate
request.usesLanguageCorrection = false
```

And I handle the results as follows:

```swift
guard let observations = request.results as? [VNRecognizedTextObservation] else { return }

for observation in observations {
    if let recognizedText = observation.topCandidates(1).first {
        guard recognizedText.confidence >= self.confidenceLimit, // set to 0.5
              let foundText = validateRegexPattern(text: recognizedText.string, regexPattern: self.regexPattern),
              let foundDecimal = Double(foundText) else { continue }
    }
}
```

This is actually working great and yielding very accurate results, but the confidence values I'm receiving are generally either 0.5 or 1.0, and rarely 0.3. I find these to be pretty nonsensical confidence values, and I'm wondering whether this is the intended result or some sort of bug. Conversely, using recognitionLevel = .fast yields more realistic and varied confidence values, but much less accurate results overall. (Even though .fast is recommended for OCR from a live camera feed, I've had significantly better results with the .accurate recognition level, which is why I've been using it.)
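The filtering logic described above can be sketched independently of Vision. In this minimal sketch, the `Candidate` struct is a hypothetical stand-in for the `string` and `confidence` properties of a recognized-text candidate, and the body of `validateRegexPattern` is an assumption, since the post does not show its implementation.

```swift
import Foundation

/// Hypothetical stand-in for the `string` and `confidence` values of a candidate.
struct Candidate {
    let string: String
    let confidence: Float
}

/// Assumed implementation of the post's `validateRegexPattern`:
/// returns the first substring matching the pattern, or nil if none matches.
func validateRegexPattern(text: String, regexPattern: String) -> String? {
    guard let range = text.range(of: regexPattern, options: .regularExpression) else {
        return nil
    }
    return String(text[range])
}

/// Keeps candidates at or above the confidence limit whose text contains
/// a match for the pattern that also parses as a Double.
func decimals(from candidates: [Candidate],
              confidenceLimit: Float,
              regexPattern: String) -> [Double] {
    return candidates.compactMap { candidate -> Double? in
        guard candidate.confidence >= confidenceLimit,
              let found = validateRegexPattern(text: candidate.string, regexPattern: regexPattern),
              let value = Double(found) else {
            return nil
        }
        return value
    }
}
```

With a limit of 0.5, candidates reported at exactly 0.5 pass the filter and those at 0.3 are dropped, so the coarse confidence steps described above effectively turn the threshold into a two-way switch.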

Machine Learning & AI General VisionKit Vision Machine Learning

1.2k

Nov ’21

Recognizing text

I'm using VNImageRequestHandler to recognize text using the camera. In my handler I'm using the topLeft, topRight, bottomLeft, bottomRight properties, which I'm scaling to the size of the canvas, to draw an outline around each text object. When I do this the Y position and Height are correct, but the Width is slightly smaller, and the X position centers the outline around the text. Any idea why this would be a different size?
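One possible cause, offered only as an assumption since the post does not show the preview setup: if the camera preview fills the canvas with aspect-fill scaling, the image is cropped along one axis, and scaling the normalized corner points by the canvas size alone gives the wrong width and X offset. A minimal sketch of the conversion under that assumption follows; the helper name `viewPoint` is hypothetical. Vision's normalized coordinates have their origin at the lower left, so the Y axis is flipped as well.

```swift
import Foundation

/// Hypothetical helper: maps a normalized point (origin at lower left, range 0...1)
/// into a view that shows the image centered with aspect-fill scaling
/// (top-left origin in the view).
func viewPoint(fromNormalized p: CGPoint, imageSize: CGSize, viewSize: CGSize) -> CGPoint {
    // Aspect fill: the larger ratio makes the image cover the whole view.
    let scale = max(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)
    let scaledWidth = imageSize.width * scale
    let scaledHeight = imageSize.height * scale

    // The scaled image overflows the view on one axis; these offsets are <= 0.
    let xOffset = (viewSize.width - scaledWidth) / 2
    let yOffset = (viewSize.height - scaledHeight) / 2

    // Flip Y because the normalized space starts at the bottom.
    return CGPoint(x: p.x * scaledWidth + xOffset,
                   y: (1 - p.y) * scaledHeight + yOffset)
}
```

Applying this to topLeft, topRight, bottomLeft, and bottomRight would give the four corners of the outline in canvas coordinates, provided the image size used matches the orientation of the displayed frame.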

Media Technologies Photos & Camera VisionKit Camera Core Graphics

1.2k

Nov ’21

ImageAnalysisInteraction doesn't call contentsRect delegate's method

App & System Services General VisionKit WWDC22 WWDC22 Support Live Text

Replies: 1
Boosts: 0
Views: 1.2k
Activity: Aug ’22

How to get the filters in VNDocumentCameraViewController without opening the camera?

Media Technologies General Image I/O VisionKit Photos and Imaging Core Image

Replies: 1
Boosts: 1
Views: 1.2k
Activity: Aug ’22

DataScannerViewController Scan Text is not working

Community Apple Developers Beta VisionKit

Replies: 1
Boosts: 0
Views: 1.1k
Activity: Aug ’22

DataScannerViewController in Objective-C

Machine Learning & AI General VisionKit wwdc2022-10024 wwdc2022-10025

Replies: 1
Boosts: 0
Views: 1.1k
Activity: Aug ’22

Getting title of document via VNRecognizedTextObservation

Machine Learning & AI General Vision VisionKit

Replies: 1
Boosts: 1
Views: 937
Activity: Jul ’22

Using DataScannerViewController with async stream

Machine Learning & AI General VisionKit wwdc2022-10025

Replies: 2
Boosts: 0
Views: 2.0k
Activity: Jul ’22

Use the camera for keyboard input in your app

Media Technologies Photos & Camera VisionKit Camera AVFoundation

Replies: 1
Boosts: 0
Views: 1.1k
Activity: Jun ’22

[VisionKit Text Recognition] boundingBox(for:) returns wrong results when used with .accurate recognition level

Machine Learning & AI General VisionKit

Replies: 1
Boosts: 0
Views: 1.2k
Activity: May ’22

Dynamic Text overlay on live camera feed

UI Frameworks SwiftUI VisionKit SwiftUI Camera

Replies: 0
Boosts: 0
Views: 720
Activity: Apr ’22

No option to scan in notes

Machine Learning & AI General VisionKit

Replies: 0
Boosts: 0
Views: 500
Activity: Apr ’22

Finding API on "scanning the document"

Programming Languages Swift Swift Playground Swift VisionKit

Replies: 0
Boosts: 0
Views: 746
Activity: Mar ’22

Order of points in VNFaceLandmarkRegion2D

App & System Services Core OS iOS VisionKit Vision

Replies: 1
Boosts: 0
Views: 910
Activity: Mar ’22

dynamic reading of text in live camera feed

UI Frameworks SwiftUI VisionKit SwiftUI AVFoundation

Replies: 0
Boosts: 0
Views: 849
Activity: Mar ’22

Cannot find type "..." in scope

Code Signing Notarization VisionKit Notarization

Replies: 3
Boosts: 0
Views: 6.7k
Activity: Feb ’22

Bad quality of scanned documents

App & System Services Networking PDFKit VisionKit Camera

Replies: 33
Boosts: 1
Views: 15k
Activity: Jan ’22

I am a new developer. How can i make my output text editable?

App & System Services General VisionKit TextKit Vision Core Text

Replies: 0
Boosts: 0
Views: 752
Activity: Jan ’22

Vision : VNDetectRectanglesRequest Error in iOS15

Machine Learning & AI General VisionKit Vision

Replies: 4
Boosts: 0
Views: 1.4k
Activity: Jan ’22

How to get the saliency mask of the VNDetectDocumentSegmentationRequest

Machine Learning & AI General VisionKit wwdc21-10041

Replies: 1
Boosts: 0
Views: 1.1k
Activity: Dec ’21

VNRecognizedText confidence values are only 0.5 and 1.0

Machine Learning & AI General VisionKit Vision Machine Learning

Replies: 0
Boosts: 0
Views: 1.2k
Activity: Nov ’21

Recognizing text

Media Technologies Photos & Camera VisionKit Camera Core Graphics

Replies: 2
Boosts: 0
Views: 1.2k
Activity: Nov ’21
