Extract document data using Vision

Discuss the WWDC21 session Extract document data using Vision.

Posts under wwdc21-10041 tag

7 Posts
Post not yet marked as solved
2 Replies
242 Views
I really need to solve this problem. The company's app was created under my Apple account, so when the app is on the App Store it appears under my personal name, and I would like to hide that or change the name that is shown. Does anyone know how to do this?
Post not yet marked as solved
1 Replies
371 Views
In one of the WWDC videos, the VNDetectDocumentSegmentationRequest result is described in the following way:

"The result of the request is a low resolution segmentation mask, where each pixel represents a confidence if that pixel is part of the detected document or not. In addition it provides the four corner points of the quadrilateral."

Similarly, in the VNDetectDocumentSegmentationRequest docs there's the following statement:

"The result that the request generates contains the four corner points of a document’s quadrilateral and saliency mask."

So the first part ("four corner points of a document’s quadrilateral") is easy. It's in the results of the request, which are in VNRectangleObservation format:

    let request = VNDetectDocumentSegmentationRequest { (request, error) in
        guard let results = request.results as? [VNRectangleObservation] else {
            // Failed
            return
        }
        // Process VNRectangleObservations
    }

But how do I obtain the "low resolution segmentation mask" / "saliency mask" for VNDetectDocumentSegmentationRequest?
Post not yet marked as solved
2 Replies
390 Views
Hi, I have seen this video: https://developer.apple.com/videos/play/wwdc2021/10041/ and in my project I am trying to draw the detected barcodes. I am using the Vision framework and I have the barcode position in the boundingBox property, but I don't understand the CGRect it contains. I am programming in Objective-C and I can't find resources for it. To complicate things further, I don't have a still image; I am capturing barcodes from a video capture session.

In parts:
1. How can I draw a detected barcode like in the video (from an image)?
2. How can I draw a detected barcode in a capture session?

I have used VNImageRectForNormalizedRect to convert from normalized to pixel coordinates, but the result is not correct. Thank you very much.
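One common cause of this kind of mismatch is the coordinate origin: Vision reports boundingBox in normalized coordinates with the origin at the lower-left corner, and VNImageRectForNormalizedRect only scales the rectangle to pixels without flipping it, while UIKit drawing uses a top-left origin. The following is a minimal sketch of the flip, assuming an upright image with no rotation or mirroring; `topLeftPixelRect` is a hypothetical helper name, not part of Vision. The arithmetic ports directly to Objective-C.

```swift
import Foundation

/// Converts a normalized rect with a lower-left origin (the convention Vision
/// uses for boundingBox) into a pixel rect with a top-left origin.
/// Hypothetical helper for illustration; it is not a Vision API.
func topLeftPixelRect(fromNormalized normalized: CGRect,
                      imageWidth: Int,
                      imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)
    // Scale the size and the x origin from 0...1 to pixels.
    let width = normalized.size.width * w
    let height = normalized.size.height * h
    let x = normalized.origin.x * w
    // Flip the y axis: the rect's top edge, measured from the top of the image.
    let y = (1 - normalized.origin.y - normalized.size.height) * h
    return CGRect(x: x, y: y, width: width, height: height)
}

// Example: a box covering the middle half of the width, in the upper half of the image.
let normalizedBox = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)
let pixelBox = topLeftPixelRect(fromNormalized: normalizedBox,
                                imageWidth: 400,
                                imageHeight: 200)
print(pixelBox.origin.x, pixelBox.origin.y, pixelBox.size.width, pixelBox.size.height)
```

For a live capture session the same flip applies, but the rectangle must additionally be mapped into the preview layer's coordinate space, which depends on video orientation and gravity and is not covered by this sketch.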
Post not yet marked as solved
0 Replies
452 Views
I tried to run the sample code from the session:

    import Foundation
    import CoreImage
    import Vision
    import CoreML

    guard var inputImage = CIImage(contentsOf: #fileLiteral(resourceName: "IMG_0001.HEIC")) else {
        fatalError("image not found")
    }
    inputImage

    let requestHandler = VNImageRequestHandler(ciImage: inputImage)
    let documentDetectionRequest = VNDetectDocumentSegmentationRequest()
    try requestHandler.perform([documentDetectionRequest])

The inputImage preview showed correctly, but requestHandler.perform failed with the following error:

    Error Domain=com.apple.vis Code=9 "failed to create image analyzer" UserInfo={NSLocalizedDescription=failed to create image analyzer, NSUnderlyingError=0x600002eac060 {Error Domain=com.apple.vis Code=9 "ImageAnalyzer failure with status 8539 (Espresso error)" UserInfo={NSLocalizedDescription=ImageAnalyzer failure with status 8539 (Espresso error)}}}

Please help.
Post not yet marked as solved
0 Replies
621 Views
I kinda like Espresso, but not so much in that context. https://github.com/stuffmc/ImageOCR

    ImageAnalyzer error -37:Espresso error in void vision::mod::ImageAnalyzer::initNetwork(const char *, const char *) @ /Library/Caches/com.apple.xbs/Sources/Vision_Sim/Vision-5.0.34/Libraries/cvml-Core/ImageAnalyzer/ImageAnalyzer.cpp:62
    Error Domain=com.apple.vis Code=9 "failed to create image analyzer" UserInfo={NSLocalizedDescription=failed to create image analyzer, NSUnderlyingError=0x147091630 {Error Domain=com.apple.vis Code=9 "ImageAnalyzer failure with status 8539 (Espresso error)" UserInfo={NSLocalizedDescription=ImageAnalyzer failure with status 8539 (Espresso error)}}}

Is it me doing something wrong?
Post not yet marked as solved
1 Replies
620 Views
Hey there, I am currently building an app which requires accurate text detection, including properly detecting text that is split up into multiple rows, as well as properly grouping paragraphs. The Live Text feature in the Camera app does exactly that, but will this also be added to the Vision framework, or is there already a way for doing this?
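Until something like that is available, one rough approach is to group recognized lines by their vertical spacing. The sketch below assumes each recognized line comes with a normalized bounding box using a lower-left origin (the convention Vision uses for text observations); `TextLine`, `groupIntoParagraphs`, and the `gapFactor` threshold are hypothetical names and values chosen for illustration, and this is a simple heuristic, not how Live Text works.

```swift
import Foundation

/// One recognized line of text with its normalized bounding box
/// (lower-left origin). Hypothetical type for illustration only.
struct TextLine {
    let text: String
    let box: CGRect
}

/// Groups lines into paragraphs. A new paragraph starts when the vertical gap
/// to the previous line exceeds `gapFactor` times the previous line's height.
func groupIntoParagraphs(_ lines: [TextLine], gapFactor: CGFloat = 0.75) -> [[String]] {
    // Sort top to bottom: with a lower-left origin, a larger y is higher on the page.
    let sorted = lines.sorted { $0.box.origin.y > $1.box.origin.y }
    var paragraphs: [[String]] = []
    var previous: TextLine? = nil
    for line in sorted {
        if let prev = previous {
            // Distance from the bottom of the previous line to the top of this one.
            let gap = prev.box.origin.y - (line.box.origin.y + line.box.size.height)
            if gap > gapFactor * prev.box.size.height {
                paragraphs.append([line.text])
            } else {
                paragraphs[paragraphs.count - 1].append(line.text)
            }
        } else {
            paragraphs.append([line.text])
        }
        previous = line
    }
    return paragraphs
}

// Example: two closely spaced lines followed by a line after a larger gap.
let sampleLines = [
    TextLine(text: "First line", box: CGRect(x: 0.1, y: 0.80, width: 0.8, height: 0.05)),
    TextLine(text: "second line", box: CGRect(x: 0.1, y: 0.74, width: 0.8, height: 0.05)),
    TextLine(text: "New paragraph", box: CGRect(x: 0.1, y: 0.60, width: 0.8, height: 0.05)),
]
let paragraphs = groupIntoParagraphs(sampleLines)
print(paragraphs)
```

The heuristic ignores columns, indentation, and rotated text, so it is only a starting point for single-column documents.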