[tags:machine learning,vision]

102 results found

Post not yet marked as solved
2 Replies
Thank you very much. Is there any information about how to translate a Vision bounding box CGRect to UIKit coordinate space in Objective-C? I couldn't find any page with that information. In the tutorial you attached there are some functions in Swift that I can't translate to Objective-C. This one:
let rectangles = boxesAndPayload.map { $0.box }
    .map { CGRect(origin: $0.origin.translateFromCoreImageToUIKitCoordinateSpace(using: image.size.height), size: $0.size) }
Thank you for your time.
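For what it's worth, here is a rough sketch of that one-liner with the chained map calls unrolled into a plain loop, which should be easier to port to Objective-C line by line. It assumes (as the helper's name suggests) that translateFromCoreImageToUIKitCoordinateSpace simply flips the y coordinate against the image height, and that the boxes are already in image-space points; the tuple type standing in for boxesAndPayload is hypothetical.

import UIKit

// Sketch only: same conversion as the tutorial's one-liner, written without chained
// closures. Core Image / Vision image space has its origin at the bottom-left,
// UIKit at the top-left, so the y origin is flipped against the image height.
func uiKitRects(from boxesAndPayload: [(box: CGRect, payload: String)],
                imageHeight: CGFloat) -> [CGRect] {
    var rectangles = [CGRect]()
    for item in boxesAndPayload {
        let box = item.box
        let flippedOrigin = CGPoint(x: box.origin.x,
                                    y: imageHeight - box.origin.y)
        rectangles.append(CGRect(origin: flippedOrigin, size: box.size))
    }
    return rectangles
}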
Post not yet marked as solved
2 Replies
Thank you, I'll take a look at that link. In general I'm after a suite of APIs similar to https://developer.apple.com/documentation/vision/identifying_3d_human_body_poses_in_images but working in real time on visionOS data.
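For reference, a minimal sketch of the single-image API from that link, assuming iOS 17's VNDetectHumanBodyPose3DRequest (property names hedged from memory); the real-time, visionOS-fed variant asked about here would be something else entirely.

import CoreGraphics
import Vision

// Sketch only: run the single-image 3D body pose request from the linked article.
func detect3DBodyPose(in cgImage: CGImage) throws {
    let request = VNDetectHumanBodyPose3DRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
    for observation in request.results ?? [] {
        // Each observation carries 3D joint positions; bodyHeight is the estimated height in meters.
        print("Body detected, estimated height:", observation.bodyHeight)
    }
}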
Post marked as solved
5 Replies
Hi Eugene,
You might be running into issues because the app doesn't run properly on Simulator. The 2nd section of the sample's README - https://developer.apple.com/documentation/vision/building_a_feature-rich_app_for_sports_analysis - says:
Configure the Project and Prepare Your Environment
You must run the sample app on a physical device with an A12 processor or later, running iOS 14. The sample relies on hardware features that aren't available in Simulator, such as the Apple Neural Engine.
If you have a physical device that meets those requirements, try running the sample there and let us know how it goes. 🙂
Post not yet marked as solved
3 Replies
I have the same issue on iPhone 11: Error Domain=com.apple.vis Code=12 "processing with VNANERuntimeProcessingDevice is not supported". Any idea?
Post not yet marked as solved
3 Replies
I replied on Stack Overflow, as you asked there too.
Post not yet marked as solved
1 Reply
699 Views
Hi, all. I'm currently writing a thesis about Computer Vision; Core ML and Vision are the subject matter of the thesis. Now, I have a ridiculously simple question that I CANNOT find anywhere in the Core ML / Vision documentation: what is the underlying method used for classifying the image? Google's AutoML explains this just two pages in: it uses a Convolutional Neural Network. Can anyone find out what method is used for Apple's Image Classifier and Object Detector? Specifically, a model created using Apple Create ML.
Post not yet marked as solved
1 Reply
It is using a proprietary deep convolutional model.
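For context, the training side is equally opaque: Create ML only asks for labeled images and hands back the trained classifier (internally it transfer-learns a classifier on top of a built-in feature extractor, but the network itself isn't documented). A rough sketch, assuming a macOS playground and a folder-per-label directory layout (paths are hypothetical):

import CreateML
import Foundation

// Sketch only: Create ML's image classifier API. You choose the data (one subfolder
// per label) and a handful of options; the underlying network is not exposed.
let trainingDir = URL(fileURLWithPath: "/path/to/TrainingData")          // hypothetical path
let classifier = try MLImageClassifier(trainingData: .labeledDirectories(at: trainingDir))
print(classifier.trainingMetrics)
print(classifier.validationMetrics)
try classifier.write(to: URL(fileURLWithPath: "/path/to/Classifier.mlmodel"))  // hypothetical path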
Post not yet marked as solved
1 Reply
Hello, The ML model for this feature is not available. If you feel that you need the ML model, please file an enhancement request using Feedback Assistant - https://developer.apple.com/bug-reporting/
You can certainly perform VNDetectHumanHandPoseRequests while an ARSession is running. The process would be similar to what is demonstrated in this sample project: https://developer.apple.com/documentation/arkit/tracking_and_altering_images Instead of using Vision to detect rectangles as shown in the sample project, you would use Vision to detect the hand pose.
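Not from that sample itself, but a hedged sketch of what the substitution could look like: feed each ARFrame's capturedImage to a VNDetectHumanHandPoseRequest from the ARSessionDelegate callback.

import ARKit
import Vision

// Sketch only: run the hand pose request on every ARKit camera frame.
final class HandPoseDetector: NSObject, ARSessionDelegate {
    private let handPoseRequest: VNDetectHumanHandPoseRequest = {
        let request = VNDetectHumanHandPoseRequest()
        request.maximumHandCount = 1
        return request
    }()

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // ARKit delivers the camera image as a CVPixelBuffer; hand it to Vision.
        let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                            orientation: .right,   // adjust for device orientation
                                            options: [:])
        do {
            try handler.perform([handPoseRequest])
            guard let observation = handPoseRequest.results?.first else { return }
            let indexTip = try observation.recognizedPoint(.indexTip)
            if indexTip.confidence > 0.3 {
                // indexTip.location is normalized, with the origin in the lower-left corner.
                print("Index fingertip at \(indexTip.location)")
            }
        } catch {
            print("Hand pose request failed: \(error)")
        }
    }
}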
Post not yet marked as solved
3 Replies
You can visualize some form of the result by reading the pixelBuffer from each VNPixelBufferObservation returned by the VNGenerateOpticalFlowRequest. For me, those results aren't super helpful, so I would also love to see the custom kernel that shows the magnitude and direction of the motion. Has anyone been able to track it down?
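Not the custom kernel mentioned above, but a hedged sketch of a CPU-side alternative: perform the request between two frames, then read the two-component float pixel buffer directly and turn each (dx, dy) pair into a magnitude (and a direction via atan2) yourself.

import CoreVideo
import Vision

// Sketch only: compute optical flow between two frames and read the per-pixel
// (dx, dy) vectors out of the resulting two-component 32-bit float pixel buffer.
func opticalFlowMagnitudes(previous: CVPixelBuffer, current: CVPixelBuffer) throws -> [Float] {
    // One frame is the handler's input, the other the targeted buffer; see the
    // VNGenerateOpticalFlowRequest docs for which way the vectors point.
    let request = VNGenerateOpticalFlowRequest(targetedCVPixelBuffer: current, options: [:])
    request.computationAccuracy = .medium
    let handler = VNImageRequestHandler(cvPixelBuffer: previous, options: [:])
    try handler.perform([request])

    guard let observation = request.results?.first as? VNPixelBufferObservation else { return [] }
    let flow = observation.pixelBuffer

    CVPixelBufferLockBaseAddress(flow, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(flow, .readOnly) }

    let width = CVPixelBufferGetWidth(flow)
    let height = CVPixelBufferGetHeight(flow)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(flow)
    guard let base = CVPixelBufferGetBaseAddress(flow) else { return [] }

    var magnitudes = [Float]()
    magnitudes.reserveCapacity(width * height)
    for y in 0..<height {
        let row = base.advanced(by: y * bytesPerRow).assumingMemoryBound(to: Float32.self)
        for x in 0..<width {
            let dx = row[2 * x]
            let dy = row[2 * x + 1]
            magnitudes.append((dx * dx + dy * dy).squareRoot())
            // Direction, if needed: atan2(dy, dx)
        }
    }
    return magnitudes
}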
Post not yet marked as solved
0 Replies
717 Views
Mostly with Chinese characters, Vision recognizes a line of text as a single 'word' when in fact there could be 2 or more. For example, this string (肖丹销售部銷售经理) includes a name (first 2 characters) and a job title (everything else). The first 2 characters have a height about twice that of the others. I've been trying to break this string into 2, but I can't find a way to do it, as the bounding box relates to the whole 'word' and not to each character. If I could get each character's bounding box I could compare them and decide to split into multiple strings when appropriate. I also tried running VNDetectTextRectanglesRequest, but the results rarely match what you get with VNRecognizeTextRequest. For example, these 9 characters return 12 VNTextObservations. Does anyone have an idea? Thanks.
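One hedged sketch of a possible approach: VNRecognizedText has boundingBox(for:), which takes a Range<String.Index>, so you can ask for a box per character of the recognized string and compare the heights yourself.

import Vision

// Sketch only: for each recognized line, get a bounding box per character so a
// string can be split wherever the glyph height changes noticeably.
func characterBoxes(from observations: [VNRecognizedTextObservation]) {
    for observation in observations {
        guard let candidate = observation.topCandidates(1).first else { continue }
        let text = candidate.string
        var index = text.startIndex
        while index < text.endIndex {
            let next = text.index(after: index)
            if let box = try? candidate.boundingBox(for: index..<next) {
                // box is a VNRectangleObservation; box.boundingBox.height is the
                // normalized character height to compare between characters.
                print(text[index], box.boundingBox.height)
            }
            index = next
        }
    }
}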
Posted by LCG
Post not yet marked as solved
0 Replies
739 Views
The new VNGeneratePersonSegmentationRequest is a stateful request, i.e. it keeps state and improves the segmentation mask generation for subsequent frames. There is also the new CIPersonSegmentationFilter as a convenient way of using the API with Core Image. But since the Vision request is stateful, I was wondering how this is handled by the Core Image filter. Does the filter also keep state between subsequent calls? How is the VNStatefulRequest requirement that "the request requires the use of CMSampleBuffers with timestamps as input" ensured?
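For context, a hedged sketch of the Vision-side usage that the quoted requirement refers to: the request handler is created from each CMSampleBuffer (which carries the presentation timestamp), and the same request object is reused across frames so it can keep its state. How CIPersonSegmentationFilter handles this internally isn't documented here.

import CoreMedia
import CoreVideo
import Vision

// Sketch only: reuse one stateful request across frames and feed it
// timestamped CMSampleBuffers from a capture or asset-reader pipeline.
final class PersonSegmenter {
    private let request: VNGeneratePersonSegmentationRequest = {
        let request = VNGeneratePersonSegmentationRequest()
        request.qualityLevel = .balanced
        request.outputPixelFormat = kCVPixelFormatType_OneComponent8
        return request
    }()

    func maskPixelBuffer(for sampleBuffer: CMSampleBuffer) -> CVPixelBuffer? {
        // The sample buffer's timestamp is what satisfies the VNStatefulRequest requirement.
        let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, options: [:])
        do {
            try handler.perform([request])
            return request.results?.first?.pixelBuffer
        } catch {
            print("Segmentation failed: \(error)")
            return nil
        }
    }
}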
Post marked as solved
2 Replies
956 Views
Hi, I have two questions regarding the ActionAndVision sample application. 1. After setting up the live AVSession, how exactly and in which function is the sample buffer given to a Vision handler to perform a Vision request (e.g. the getLastThrowType request)? 2. When and how is the captureOutput(...) func in the CameraViewController called? (line 268 ff) I appreciate any help, thank you very much.
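As a general note (this is the standard AVFoundation pattern, not the sample's exact code): captureOutput(_:didOutput:from:) is never called by your own code. AVFoundation invokes it for every camera frame once the controller registers itself as the video data output's sample buffer delegate, and that callback is the usual place to hand the CMSampleBuffer to a Vision handler. A hedged sketch of the mechanism:

import AVFoundation
import Vision

// Sketch only: the delegate registration is what triggers the per-frame callbacks,
// and the callback is where frames get handed to Vision.
final class FrameProcessor: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let sequenceHandler = VNSequenceRequestHandler()

    func attach(to session: AVCaptureSession, queue: DispatchQueue) {
        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: queue)   // registration; AVFoundation now calls captureOutput
        if session.canAddOutput(output) { session.addOutput(output) }
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        // Hand the frame to Vision; this is the general shape of the per-frame
        // request handling a sample like ActionAndVision performs.
        let request = VNDetectHumanBodyPoseRequest()
        try? sequenceHandler.perform([request], on: pixelBuffer, orientation: .up)
    }
}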