[tags:machine learning,vision]

102 results found

Post marked as solved
5 Replies
Hi Eugene, You might be running into issues because the app doesn't run properly on Simulator. The second section of the sample's README (https://developer.apple.com/documentation/vision/building_a_feature-rich_app_for_sports_analysis), "Configure the Project and Prepare Your Environment", says: "You must run the sample app on a physical device with an A12 processor or later, running iOS 14. The sample relies on hardware features that aren't available in Simulator, such as the Apple Neural Engine." If you have a physical device that meets those requirements, try running the sample there and let us know how it goes. 🙂
Post not yet marked as solved
3 Replies
I have the same issue on iPhone 11: Error Domain=com.apple.vis Code=12 "processing with VNANERuntimeProcessingDevice is not supported". Any ideas?
Post not yet marked as solved
3 Replies
I replied on Stack Overflow, as you asked there too.
Post not yet marked as solved
1 Reply
696 Views
Hi, all. I'm currently writing a thesis about Computer Vision, with Core ML and Vision as the subject matter. Now, I have a ridiculously simple question that I CANNOT find answered anywhere in the Core ML / Vision documentation: what is the underlying method used for classifying the image? Google's AutoML explains this just two pages in: it uses a Convolutional Neural Network. Can anyone find out what method is used for Apple's image classifier and object detector, specifically for a model created using Apple's Create ML?
Posted by
Post not yet marked as solved
1 Reply
It is using a proprietary deep convolutional model
Post not yet marked as solved
1 Reply
Hello, The ML model for this feature is not available. If you feel that you need the ML model, please file an enhancement request using Feedback Assistant - https://developer.apple.com/bug-reporting/ You can certainly perform VNDetectHumanHandPoseRequests while an ARSession is running. The process would be similar to what is demonstrated in this sample project: https://developer.apple.com/documentation/arkit/tracking_and_altering_images Instead of using Vision to detect rectangles as shown in the sample project, you would use Vision to detect the hand pose.
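For illustration, here is a minimal sketch of that approach (not the sample project's code): an ARSessionDelegate that runs a VNDetectHumanHandPoseRequest on each frame's capturedImage. The orientation value and confidence threshold are assumptions you would adapt to your setup.

```swift
import ARKit
import Vision

final class HandPoseARDelegate: NSObject, ARSessionDelegate {
    // Reuse one request; maximumHandCount limits how many hands Vision reports.
    private let handPoseRequest: VNDetectHumanHandPoseRequest = {
        let request = VNDetectHumanHandPoseRequest()
        request.maximumHandCount = 1
        return request
    }()

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // ARFrame.capturedImage is a CVPixelBuffer that Vision can consume directly.
        // .right assumes a portrait device orientation; adjust for your app.
        let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
                                            orientation: .right,
                                            options: [:])
        do {
            try handler.perform([handPoseRequest])
            guard let observation = handPoseRequest.results?.first else { return }
            // Normalized (0...1) locations of all hand joints for this frame.
            let joints = try observation.recognizedPoints(.all)
            if let indexTip = joints[.indexTip], indexTip.confidence > 0.3 {
                print("Index fingertip at \(indexTip.location)")
            }
        } catch {
            print("Hand pose request failed: \(error)")
        }
    }
}
```

In practice you would probably dispatch the Vision work off the session delegate queue so frame delivery is not stalled.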
Post not yet marked as solved
3 Replies
You can visualize some form of the result by reading the pixelBuffer from each VNPixelBufferObservation returned by the VNGenerateOpticalFlowRequest. For me, those results aren't super helpful, so I would also love to see the custom kernel that shows the magnitude and direction of the motion. Has anyone been able to track it down?
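For reference, a minimal sketch of how the raw result can be read (this is not the custom visualization kernel from the session): run the request on one frame targeting the next, then sample the two-channel float pixelBuffer of the resulting VNPixelBufferObservation. The frame ordering here is an assumption; swap the two buffers if the flow direction comes out reversed.

```swift
import Vision
import CoreVideo

/// Computes optical flow between two frames and samples the flow vector
/// at one pixel, just to show how the raw observation can be read.
func opticalFlow(from previous: CVPixelBuffer, to current: CVPixelBuffer) {
    // Assumption: the handler holds the earlier frame, the request targets the later one.
    let request = VNGenerateOpticalFlowRequest(targetedCVPixelBuffer: current, options: [:])
    request.computationAccuracy = .medium

    let handler = VNImageRequestHandler(cvPixelBuffer: previous, options: [:])
    do {
        try handler.perform([request])
        guard let observation = request.results?.first as? VNPixelBufferObservation else { return }

        let flow = observation.pixelBuffer   // two 32-bit floats (dx, dy) per pixel
        CVPixelBufferLockBaseAddress(flow, .readOnly)
        defer { CVPixelBufferUnlockBaseAddress(flow, .readOnly) }

        let width = CVPixelBufferGetWidth(flow)
        let height = CVPixelBufferGetHeight(flow)
        let bytesPerRow = CVPixelBufferGetBytesPerRow(flow)
        guard let base = CVPixelBufferGetBaseAddress(flow) else { return }

        // Sample the motion vector at the centre of the frame.
        let row = base.advanced(by: (height / 2) * bytesPerRow)
            .assumingMemoryBound(to: Float32.self)
        let dx = row[(width / 2) * 2]
        let dy = row[(width / 2) * 2 + 1]
        print("Flow at centre: dx=\(dx), dy=\(dy)")
    } catch {
        print("Optical flow request failed: \(error)")
    }
}
```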
Post not yet marked as solved
0 Replies
715 Views
With Chinese characters especially, Vision recognizes a line of text as a single 'word' when in fact it contains two or more. For example, this string (肖丹销售部銷售经理) includes a name (the first 2 characters) and a job title (everything else). The first 2 characters are about twice the height of the others. I've been trying to break this string into two, but I can't find a way to do it, as the bounding box relates to the whole 'word' and not to each character. If I could get each character's bounding box, I could compare them and decide to produce multiple strings when appropriate. I also tried to run VNDetectTextRectanglesRequest, but the results don't always match (rarely, actually) what you get with VNRecognizeTextRequest. For example, these 9 characters return 12 VNTextObservations. Does anyone have an idea? Thanks.
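One approach that may be worth trying (a sketch only, not verified against this exact string): ask the top VNRecognizedText candidate for the bounding box of each single-character range via boundingBox(for:), then compare the per-character heights to decide where to split.

```swift
import Vision

/// Inspect per-character geometry of a recognized line by asking the
/// candidate string for the bounding box of each one-character range.
func characterBoxes(in observation: VNRecognizedTextObservation) {
    guard let candidate = observation.topCandidates(1).first else { return }
    let text = candidate.string

    for index in text.indices {
        let range = index..<text.index(after: index)
        // boundingBox(for:) returns a VNRectangleObservation in normalized coordinates.
        if let box = try? candidate.boundingBox(for: range) {
            print("\(text[range]): height \(box.boundingBox.height)")
        }
    }
    // Comparing these heights is one way to split the line, e.g. the taller
    // name characters versus the smaller job-title characters.
}
```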
Posted by LCG
Post not yet marked as solved
0 Replies
734 Views
The new VNGeneratePersonSegmentationRequest is a stateful request, i.e. it keeps state and improves the segmentation-mask generation for subsequent frames. There is also the new CIPersonSegmentationFilter as a convenient way of using the API with Core Image. But since the Vision request is stateful, I was wondering how this is handled by the Core Image filter. Does the filter also keep state between subsequent calls? How is the VNStatefulRequest requirement that the input be CMSampleBuffers with timestamps ensured?
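For context, on the Vision side the statefulness comes from reusing a single request instance and performing it with timestamped CMSampleBuffers, roughly as in the sketch below (quality level and output pixel format are arbitrary choices here). Whether CIPersonSegmentationFilter does the equivalent internally is exactly the open question.

```swift
import Vision
import CoreMedia
import CoreVideo

final class PersonSegmenter {
    // Keep the same request alive across frames so it can carry state forward.
    private let request: VNGeneratePersonSegmentationRequest = {
        let r = VNGeneratePersonSegmentationRequest()
        r.qualityLevel = .balanced
        r.outputPixelFormat = kCVPixelFormatType_OneComponent8
        return r
    }()

    func mask(for sampleBuffer: CMSampleBuffer) -> CVPixelBuffer? {
        // Initializing the handler with the sample buffer (not a bare pixel buffer)
        // gives the stateful request the presentation timestamps it needs.
        let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer, options: [:])
        do {
            try handler.perform([request])
            return request.results?.first?.pixelBuffer
        } catch {
            print("Segmentation failed: \(error)")
            return nil
        }
    }
}
```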
Posted by
Post marked as solved
2 Replies
953 Views
Hi, I have two questions regarding the ActionAndVision sample application. After setting up the live AVSession, how exactly, and in which function, is the sample buffer given to a Vision handler to perform a Vision request (e.g. the getLastThrowType request)? When and how is the captureOutput(...) func in the CameraViewController called? (line 268 ff.) I appreciate any help, thank you very much.
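This is not the sample's exact code, but the general pattern it follows: AVCaptureVideoDataOutput calls captureOutput(_:didOutput:from:) on its sample buffer delegate for every frame, and that callback wraps the CMSampleBuffer in a VNImageRequestHandler and performs the requests. The class name, request type, and orientation below are placeholders.

```swift
import AVFoundation
import Vision

final class FrameAnalyzer: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    private let bodyPoseRequest = VNDetectHumanBodyPoseRequest()

    // AVCaptureVideoDataOutput calls this on its delegate queue for each frame,
    // provided setSampleBufferDelegate(_:queue:) was called with this object.
    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        let handler = VNImageRequestHandler(cmSampleBuffer: sampleBuffer,
                                            orientation: .up,
                                            options: [:])
        do {
            try handler.perform([bodyPoseRequest])
            if let observation = bodyPoseRequest.results?.first {
                // Hand the observation to whatever analyzes the throw, score, etc.
                print("Body pose with \(observation.availableJointNames.count) joints")
            }
        } catch {
            print("Vision request failed: \(error)")
        }
    }
}
```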
Posted by
Post not yet marked as solved
1 Reply
I found the code on Apple Developer app. Thank you!
Post not yet marked as solved
0 Replies
422 Views
I'm referring to this talk: https://developer.apple.com/videos/play/wwdc2021/10152 I was wondering if the code for the Image composition project he demonstrates at the end of the talk (around 24:00) is available somewhere? Would much appreciate any help.
Posted by
Post not yet marked as solved
0 Replies
759 Views
We're well into COVID times now, so building Vision apps involving people wearing masks should be expected. Vision's face rectangle detector works perfectly fine on faces with masks, but that's not the case for face landmarks. Even when someone is wearing a mask, there are still a lot of landmarks exposed (e.g., pupils, eyes, nose, eyebrows, etc.). When can we expect face landmark detection to work on faces with masks?
Posted by
Post not yet marked as solved
0 Replies
368 Views
Hello, I am Pieter Bikkel. I study Software Engineering at HAN University of Applied Sciences, and I am working on an app that can recognize volleyball actions using machine learning. A volleyball coach can put an iPhone on a tripod and analyze a volleyball match, for example where the ball lands in the court and how hard the ball is served. I was inspired by this session and wondered if I could interview one of the experts in this field. This would allow me to develop my app even better. I hope you can help me with this.
Posted by