Hi,
I was watching this WWDC23 video on Metal with xrOS (https://developer.apple.com/videos/play/wwdc2023/10089/?time=1222). However, when I tried it, the Compositor Services API wasn't available. Is it?
Or when will it be released?
Thanks.
Post not yet marked as solved
Is there a framework that allows for classic image processing operations in real time on incoming imagery from the front-facing cameras before it is displayed on the OLED screens? Things like spatial filtering, histogram equalization, and image warping. I saw the documentation for the Vision framework, but it seems to address high-level tasks, like object detection and recognition. Thank you!
Post not yet marked as solved
Is there a way to enable Wi-Fi on the Vision Pro in its simulated environment and pass through devices connected to the local network or the Mac?
Trying to use VNGeneratePersonSegmentationRequest. It seems to work, but the output mask isn't at the same resolution as the source image, so compositing the result with the source produces a bad result.
Not the full code, but hopefully enough to see what I'm doing.
var imageRect = CGRect(x: 0, y: 0, width: image.size.width, height: image.size.height)
let imageRef = image.cgImage(forProposedRect: &imageRect, context: nil, hints: nil)!
let request = VNGeneratePersonSegmentationRequest()
let handler = VNImageRequestHandler(cgImage: imageRef)
do {
    try handler.perform([request])
    guard let result = request.results?.first else {
        return
    }
    // Is this the right way to do this?
    let output = result.pixelBuffer
    // This ciImage alpha mask is a different resolution than the source image,
    // so I don't know how to combine it with the source to cut out the foreground;
    // they don't line up, and the mask isn't even the right aspect ratio.
    let ciImage = CIImage(cvPixelBuffer: output)
    ...
}
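In case it's useful, one way to reconcile the resolutions is to scale the mask up to the source extent before compositing. A rough sketch, assuming Core Image's CIBlendWithMask (white areas keep the foreground); `maskedPerson` and `scaleTransform` are made-up names:

```swift
import CoreImage
import CoreVideo
import Vision

// Scale factors to bring the low-resolution mask up to the source extent.
func scaleTransform(from mask: CGRect, to source: CGRect) -> CGAffineTransform {
    CGAffineTransform(scaleX: source.width / mask.width, y: source.height / mask.height)
}

// Hypothetical helper: cut the person out of `source`, scaling the mask to
// the source resolution before blending.
func maskedPerson(from source: CGImage) throws -> CIImage? {
    let request = VNGeneratePersonSegmentationRequest()
    request.qualityLevel = .accurate
    request.outputPixelFormat = kCVPixelFormatType_OneComponent8

    try VNImageRequestHandler(cgImage: source).perform([request])
    guard let maskBuffer = request.results?.first?.pixelBuffer else { return nil }

    let sourceImage = CIImage(cgImage: source)
    var maskImage = CIImage(cvPixelBuffer: maskBuffer)
    // The mask comes back at the model's working resolution, not the
    // source's, so resize it before compositing.
    maskImage = maskImage.transformed(by: scaleTransform(from: maskImage.extent, to: sourceImage.extent))

    let blend = CIFilter(name: "CIBlendWithMask")!
    blend.setValue(sourceImage, forKey: kCIInputImageKey)
    blend.setValue(CIImage.empty(), forKey: kCIInputBackgroundImageKey)
    blend.setValue(maskImage, forKey: kCIInputMaskImageKey)
    return blend.outputImage
}
```

Setting `qualityLevel = .accurate` also gives a larger mask than the default, which reduces how much scaling is needed.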
Post not yet marked as solved
Hi,
Can the Vision Pro see outside and let us take a picture programmatically, e.g. the way apps request camera permission on iPad/iPhone to access the device camera?
Thank you
Post not yet marked as solved
Is there a way to move a rigged character with its armature bones in ARKit/RealityKit?
I am trying to do this. When I try to move the usdz robot provided in https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/capturing_body_motion_in_3d using JointTransform, it gives me the following:
I see the documentation on character rigging, etc., but is movement through armature bones only available through third-party software, or can it be done in RealityKit/ARKit/RealityView?
https://developer.apple.com/documentation/arkit/arkit_in_ios/content_anchors/rigging_a_model_for_motion_capture
Is it possible to use import CreateML on an iOS project? I'm looking at the code form the "Build dynamic iOS apps with the Create ML framework" video from this link https://developer.apple.com/videos/play/wwdc2021/10037/, but I'm not sure what kind of project I need to create. If I created an iOS project and tried running the code, what inputs would I need?
Post not yet marked as solved
First of all, this Vision API is amazing; the OCR is very accurate.
I've been looking into multiprocessing with the Vision API. I have about 2 million PDFs I want to OCR, and I want to run multiple threads / parallel processing to OCR them.
I tried PyObjC, but it does not work so well. Any suggestions on tackling this problem?
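For what it's worth, on the Swift side one common pattern is to give each worker its own request and handler (requests are cheap to create and this avoids sharing mutable request state across threads) and fan pages out with `concurrentPerform`. A rough sketch, assuming the PDF pages have already been rendered to CGImages; `recognizePages` and `pageImages` are made-up names:

```swift
import Foundation
import Vision

// Hypothetical: OCR each page image in parallel, returning one string per page.
func recognizePages(_ pageImages: [CGImage]) -> [String] {
    var results = [String](repeating: "", count: pageImages.count)
    let lock = NSLock()

    // concurrentPerform spreads iterations across available cores.
    DispatchQueue.concurrentPerform(iterations: pageImages.count) { i in
        let request = VNRecognizeTextRequest()
        request.recognitionLevel = .accurate
        let handler = VNImageRequestHandler(cgImage: pageImages[i])
        guard (try? handler.perform([request])) != nil else { return }
        let text = (request.results ?? [])
            .compactMap { $0.topCandidates(1).first?.string }
            .joined(separator: "\n")
        lock.lock(); results[i] = text; lock.unlock()
    }
    return results
}
```

For 2 million PDFs it may also be worth batching the work and writing results out incrementally, so a crash partway through doesn't lose everything.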
Post not yet marked as solved
I am looking for the examples demoed by Frank in session wwdc21-10041. I can't seem to find them anywhere. Any lead is appreciated.
Post not yet marked as solved
Hi there,
I'm not sure if I'm missing something, but I've tried passing a variety of CGImages into SCSensitivityAnalyzer, including ones which should be flagged as sensitive, and it always returns false. It doesn't throw an exception, and I have the Sensitive Content Warning enabled in Settings (confirmed by checking the analysisPolicy at runtime).
I've tried both the async and callback versions of analyzeImage.
This is with Xcode 15 beta 5.
I'm primarily testing on iOS/iPad simulators. Is that a known issue?
cheers,
Mike
Post not yet marked as solved
Can you share the source code for the demo of the Vision face detector with the metrics (roll, yaw, and pitch) displayed? You provide some code online, but not for this portion of the presentation.
Post not yet marked as solved
When I customize gesture interaction, how do I set the key value? It depends on the accuracy of finger-joint recognition and distance detection. What is the accuracy of finger-joint detection and distance discrimination?
Post not yet marked as solved
I'm trying to create a sky mask on pictures taken with my iPhone. I've seen in the documentation that Core Image supports semantic segmentation for sky, among other types for person (skin, hair, etc.).
So far, I haven't found the proper workflow to use it.
First, I watched https://developer.apple.com/videos/play/wwdc2019/225/
I understood that images must be captured with segmentation enabled, using this kind of code:
photoSettings.enabledSemanticSegmentationMatteTypes = self.photoOutput.availableSemanticSegmentationMatteTypes
photoSettings.embedsSemanticSegmentationMattesInPhoto = true
I capture the image on my iPhone, save it in HEIC format, then later try to load the matte like this:
let skyMatte = CIImage(contentsOf: imageURL, options: [.auxiliarySemanticSegmentationSkyMatte: true])
Unfortunately, self.photoOutput.availableSemanticSegmentationMatteTypes always gives me a list of person types only, and never a sky type.
In any case, AVSemanticSegmentationMatte.MatteType is just [hair, skin, teeth, glasses]. No sky!
So how am I supposed to use semanticSegmentationSkyMatteImage? Is there any simple workaround?
Post not yet marked as solved
Hi,
I want to control a hand model via hand motion capture.
I know there is a sample project and some articles about rigging a model for motion capture in the ARKit documentation, but that solution is quite encapsulated in BodyTrackedEntity, and I can't find an appropriate Entity for controlling just a hand model.
By using VNDetectHumanHandPoseRequest from the Vision framework, I can get hand joint info, but I don't know how to use that info in RealityKit to control a 3D hand model.
Do you know how to do that or do you have any idea on how should it be implemented?
Thanks
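For reference, the Vision side of this can be sketched as follows: pull a few joint positions per frame that could then drive a skeleton (mapping them onto a model's joint transforms is the part that has to be hand-rolled). `indexFingerJoints` is a made-up name, and the 0.3 confidence cutoff is an assumption to tune:

```swift
import Vision

// Hypothetical: extract normalized index-finger joint positions from a CGImage.
func indexFingerJoints(in image: CGImage) throws -> [CGPoint] {
    let request = VNDetectHumanHandPoseRequest()
    request.maximumHandCount = 1
    try VNImageRequestHandler(cgImage: image).perform([request])
    guard let hand = request.results?.first else { return [] }

    // Joints from knuckle to tip; confidence-filter to drop occluded joints.
    let names: [VNHumanHandPoseObservation.JointName] = [.indexMCP, .indexPIP, .indexDIP, .indexTip]
    return try names.compactMap { name in
        let point = try hand.recognizedPoint(name)
        return point.confidence > 0.3 ? point.location : nil
    }
}
```

Note that the returned locations are normalized image coordinates with a lower-left origin, so they need converting before being used to pose a 3D rig.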
Post not yet marked as solved
I am trying to use VNDetectFaceRectanglesRequest to detect face bounding boxes on frames obtained from ARKit callbacks.
My app is in portrait device orientation, and I am passing the .right orientation to the perform method on VNSequenceRequestHandler, something like:
private let requestHandler = VNSequenceRequestHandler()
private var facePoseRequest: VNDetectFaceRectanglesRequest!
// ...
try? self.requestHandler.perform([self.facePoseRequest], on: currentBuffer, orientation: orientation)
I'm setting .right for orientation above, in the hope that the Vision framework will re-orient the buffer before running inference.
I'm trying to draw the returned bounding box on top of the image. Here's my result-processing code:
let currBufWidth = CVPixelBufferGetWidth(currentBuffer)
let currBufHeight = CVPixelBufferGetHeight(currentBuffer)
guard let faceRes = self.facePoseRequest.results?.first as? VNFaceObservation else {
    return
}
// Option 1: assuming the reported BB is in the coordinate space of the
// orientation-adjusted pixel buffer.
// Problems/observations:
//   - The bounding box turns into a square with equal width and height.
//   - The BB does not cover the entire face, only from chin to eyes.
// Notice height and width are flipped below.
let flippedBB = VNImageRectForNormalizedRect(faceRes.boundingBox, currBufHeight, currBufWidth)
// vs.
// Option 2: assuming the reported BB is in the coordinate system of the
// original, un-oriented pixel buffer.
// Problems/observations:
//   - The drawn BB does appear as a rectangle covering most of the face,
//     but it is not always centered on the face.
//   - It moves around the screen when I tilt the device or my face.
let reportedBB = VNImageRectForNormalizedRect(faceRes.boundingBox, currBufWidth, currBufHeight)
In Option 1 above, the bounding box becomes a square, with width and height equal. I noticed that the reported normalized BB has the same aspect ratio as the input pixel buffer (1.33), which is why width and height become equal when I flip the width and height parameters in VNImageRectForNormalizedRect.
In Option 2 above, the BB seems to be roughly the right size, but it jumps around when I tilt the device or my head.
What coordinate system are the reported bounding boxes in?
Do I need to adjust for the y-flippedness of the Vision framework before performing the above operations?
What's the best way to draw these bounding boxes on the captured frame and/or ARView?
Thank you
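For reference, as far as I understand it, Vision's boundingBox is normalized to the *oriented* image it analyzed, with the origin in the lower left. A pure-Swift sketch of the two conversions usually needed (scale up to pixels, then flip y for a top-left coordinate system); the function names are made up:

```swift
import Foundation

// Scale a normalized (0–1, lower-left origin) rect to pixel coordinates,
// mirroring what VNImageRectForNormalizedRect does.
func pixelRect(fromNormalized r: CGRect, width: Int, height: Int) -> CGRect {
    CGRect(x: r.minX * CGFloat(width),
           y: r.minY * CGFloat(height),
           width: r.width * CGFloat(width),
           height: r.height * CGFloat(height))
}

// Flip into a top-left-origin system (UIKit / screen space).
func flippedVertically(_ r: CGRect, imageHeight: Int) -> CGRect {
    CGRect(x: r.minX,
           y: CGFloat(imageHeight) - r.maxY,
           width: r.width,
           height: r.height)
}
```

Because .right was passed, the oriented image Vision analyzed has the buffer's width and height swapped, so the width/height arguments here should be (bufferHeight, bufferWidth), and the resulting rect lives in the rotated space; it then has to be mapped back into the un-rotated buffer (or directly into view coordinates) before drawing.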
Post not yet marked as solved
Hi, I am using Vision to detect fingers for counting. I am able to detect the finger joints; however, even when a finger is closed, Vision still seems to detect its joints, so I can't differentiate reliably. I have tried requiring all joints of a finger to be detected before deciding the finger is raised, but that doesn't seem to work. Any ideas on how I can improve the accuracy of detecting raised fingers?
Thanks
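One heuristic that tends to be more reliable than joint presence alone: joints are reported even for curled fingers, so compare distances from the wrist instead — for a raised finger, the tip is clearly farther from the wrist than the middle (PIP) joint. A pure-geometry sketch; the function name and the 1.3 margin are assumptions to tune:

```swift
import Foundation

// Hypothetical heuristic: a finger counts as raised when its tip is clearly
// farther from the wrist than its PIP joint.
func isFingerRaised(wrist: CGPoint, pip: CGPoint, tip: CGPoint, margin: CGFloat = 1.3) -> Bool {
    func distance(_ a: CGPoint, _ b: CGPoint) -> CGFloat {
        let dx = a.x - b.x, dy = a.y - b.y
        return (dx * dx + dy * dy).squareRoot()
    }
    return distance(wrist, tip) > margin * distance(wrist, pip)
}
```

In practice it also helps to gate each joint on its reported confidence before trusting its position at all.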
Post not yet marked as solved
Hello, I have created a view with a full-screen 360° image, and I need to perform a task when the user clicks anywhere on the screen (to leave the dome), but no matter what I try, it just does not work; it doesn't print anything at all.
import SwiftUI
import RealityKit
import RealityKitContent
struct StreetWalk: View {
    @Binding var threeSixtyImage: String
    @Binding var isExitFaded: Bool

    var body: some View {
        RealityView { content in
            // Create a material with a 360 image.
            guard let url = Bundle.main.url(forResource: threeSixtyImage, withExtension: "jpeg"),
                  let resource = try? await TextureResource(contentsOf: url) else {
                // If the asset isn't available, something is wrong with the app.
                fatalError("Unable to load starfield texture.")
            }
            var material = UnlitMaterial()
            material.color = .init(texture: .init(resource))
            // Attach the material to a large sphere.
            let streetDome = Entity()
            streetDome.name = "streetDome"
            streetDome.components.set(ModelComponent(
                mesh: .generateSphere(radius: 1000),
                materials: [material]
            ))
            // Ensure the texture image points inward at the viewer.
            streetDome.scale *= .init(x: -1, y: 1, z: 1)
            content.add(streetDome)
        } update: { updatedContent in
            // Create a material with a 360 image.
            guard let url = Bundle.main.url(forResource: threeSixtyImage, withExtension: "jpeg"),
                  let resource = try? TextureResource.load(contentsOf: url) else {
                // If the asset isn't available, something is wrong with the app.
                fatalError("Unable to load starfield texture.")
            }
            var material = UnlitMaterial()
            material.color = .init(texture: .init(resource))
            updatedContent.entities.first?.components.set(ModelComponent(
                mesh: .generateSphere(radius: 1000),
                materials: [material]
            ))
        }
        .gesture(tap)
    }

    var tap: some Gesture {
        SpatialTapGesture().targetedToAnyEntity().onChanged { value in
            // Access the tapped entity here.
            print(value.entity)
            print("maybe you can tap the dome")
            // isExitFaded.toggle()
        }
    }
}
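One thing that commonly causes silent tap failures in RealityView: targetedToAnyEntity() only delivers events for entities that carry both an InputTargetComponent and a CollisionComponent, and the dome above has neither. A minimal sketch of what could be added to the dome entity (the sphere collision shape and its radius are assumptions matching the dome's size; `makeTappable` is a made-up name):

```swift
import RealityKit

// Hypothetical setup: make a dome entity hit-testable for SpatialTapGesture.
func makeTappable(_ dome: Entity) {
    // Marks the entity as a target for spatial input (taps, gazes, etc.).
    dome.components.set(InputTargetComponent())
    // Gestures hit-test against collision shapes, so one is required too.
    dome.components.set(CollisionComponent(shapes: [.generateSphere(radius: 1000)]))
}
```

Calling this on the dome after `content.add(streetDome)` should make the SpatialTapGesture fire.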
Post not yet marked as solved
Hello,
I am looking for something that will let me anchor a webview component to the user, so that it follows their line of sight as they move.
I tried using RealityView with an AnchorEntity, but it raises the error "Presentations are not permitted within volumetric window scene". Can I anchor the window instead?
Post not yet marked as solved
I have a simple Swift script where I send a VNRecognizeTextRequest to recognize numbers in a selected area of the screen.
It works perfectly fine with macOS 13.5.1. However, the accuracy in macOS 14 is pretty low. Could this be related to it just being a beta, or to some other change?
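If the regression tracks the OS's default text-recognition model, one mitigation worth trying is pinning the request to a specific revision and the accurate path explicitly, rather than taking the current default. A sketch (the specific revision to pin is an assumption; checking supportedRevisions at runtime is the safe way to pick one):

```swift
import Vision

let request = VNRecognizeTextRequest()
request.recognitionLevel = .accurate
request.usesLanguageCorrection = true
// Pin a specific model revision instead of the OS default, so OS updates
// don't silently change results.
if VNRecognizeTextRequest.supportedRevisions.contains(VNRecognizeTextRequestRevision3) {
    request.revision = VNRecognizeTextRequestRevision3
}
```

Constraining recognitionLanguages (or using a custom words list) can also help when only numbers are expected.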
Post not yet marked as solved
Hello, I am Pieter Bikkel. I study Software Engineering at HAN University of Applied Sciences, and I am working on an app that can recognize volleyball actions using machine learning. A volleyball coach can put an iPhone on a tripod and analyze a volleyball match: for example, where the ball lands in the field, or how hard the ball is served. I was inspired by this session and wondered if I could interview one of the experts in this field. This would allow me to develop my app even better. I hope you can help me with this.