VisionKit

Scan documents with the camera on iPhone and iPad devices using VisionKit.

VisionKit Documentation

Post

Replies

Boosts

Views

Activity

Can I obtain the position of the main window in Vision OS?

Is there a way to retrieve the position of the main window in Vision OS? I'd like to implement a feature where users can drag the window and have a 3D model follow it

UI Frameworks SwiftUI Vision VisionKit visionOS

907

Aug ’24

Vision & VisionKit reading direction in iOS 18.0 beta differs from iOS 16 and 17

I have tested my application on iOS 15, 16, and 17, where VisionKit read values in the horizontal direction. After I updated my device to the iOS 18.0 beta, values are read in the vertical direction. The build was generated with Xcode 13.4.1. Could the team please help me understand why this happens and whether anything needs to change at the code level?

Graphics & Games General Vision VisionKit

711

Aug ’24

Given an Image with varying contours create a stroke/outline

Essentially, I'm trying to find the most straightforward/simple way to outline an Image with varying contours. The intention is similar to the way iMessage allows you to add an outline to a sticker. The "goal" in the example is simply the input image on top of the outline.

UI Frameworks SwiftUI Vision SwiftUI VisionKit Core Image

629

Aug ’24

VisionKit crashes on iOS 16.4.

App crashes on iOS 16.4 when the ImageAnalysisInteraction API from VisionKit is used. The app crashes before it even starts. Here is the output:

    dyld[3240]: Symbol not found: _$s9VisionKit24ImageAnalysisInteractionC7subject2atAC7SubjectVSgSo7CGPointV_tYaFTu
    Referenced from: <BAD7A699-FB4E-3D0E-8CD4-45CC9FC3D5E5> /Users/sereza/Library/Developer/CoreSimulator/Devices/B64EAF39-0DD9-49EC-A3F7-69675C94B8BE/data/Containers/Bundle/Application/F4E30E86-ED4D-4748-AB99-434208D55483/VisionKitChecker.app/VisionKitChecker
    Expected in: <F05E3A17-D74A-3EE2-BC8D-DDCC23E48707> /Library/Developer/CoreSimulator/Volumes/iOS_20E247/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS 16.4.simruntime/Contents/Resources/RuntimeRoot/System/Library/Frameworks/VisionKit.framework/VisionKit

Here is enough code to reproduce this crash. Please note that this code never gets called. It is enough that it exists in the project:

    import VisionKit

    @MainActor
    final class LiftHelper: ObservableObject {
        func doSomething() async throws {
            let interaction = ImageAnalysisInteraction()
            let _ = try await interaction.image(for: [])
        }
    }

Machine Learning & AI Apple Intelligence VisionKit

836

Jul ’24

What is the technology of right clicking to automatically recognize objects in photos

Does anyone know which control is used to automatically recognize objects in photos and achieve the function of cutout by right-clicking the mouse?

Media Technologies Photos & Camera VisionKit

833

Jul ’24

Vision Pro OS file location

I would like to know the global path of the Vision Pro file system. For instance, if I put a file called example.pdf inside "On My Apple Vision Pro", what would be the global path for that file? "On My Apple Vision Pro/user_name/example.pdf" or "/example.pdf" or "/username/example.pdf" and so on. I searched for it but didn't find any official source. Thanks in advance!

App & System Services Core OS VisionKit visionOS Files and Storage

1.2k

Jul ’24

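What technology automatically recognizes objects in photos on right-click?

Does anyone know which control provides the feature where right-clicking with the mouse automatically recognizes an object in a photo and then lets you cut it out?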

Media Technologies Photos & Camera VisionKit

606

Jul ’24

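What is the feature that recognizes objects in a photo on long press and shows their outline?

The system photo album on the phone has a feature that recognizes objects on long press. What is this feature called in Apple development, and which control should I use to get it?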

Media Technologies Photos & Camera VisionKit

716

Jul ’24

Long press to automatically recognize objects in photos and display the outline function

There is a long press recognition feature in the photo album of the mobile phone system. What is this feature called in Apple development, and which control should I use to have this feature?

Media Technologies Photos & Camera VisionKit

740

Jul ’24

Can you match a new photo with existing images?

I'm looking for a solution to take a picture of, or point the camera at, a piece of clothing and match that image with an image the user has stored in my app. I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures they store in the database, I think I cannot use pre-trained Core ML models. I would like the matching to be done on device if possible instead of going to an external service. An external service will probably describe the item based on what the AI sees, but then I cannot match the item with the stored images in the app. Does anyone know if this is possible with frameworks such as Vision or VisionKit?

Machine Learning & AI General Vision Machine Learning VisionKit Core ML

1.2k

Jul ’24
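
One possible on-device approach to the matching question above is to compare Vision feature prints rather than classify the images. The sketch below assumes the stored Binary Data has already been decoded into CGImage values; the helper names featurePrint(for:) and closestMatch(to:among:) are hypothetical, and what distance counts as "the same item" is something you would have to tune on your own photos.

```swift
import CoreGraphics
import Vision

/// Computes a Vision feature print for an image (hypothetical helper name).
func featurePrint(for image: CGImage) throws -> VNFeaturePrintObservation? {
    let request = VNGenerateImageFeaturePrintRequest()
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return request.results?.first
}

/// Returns the index of the stored image whose feature print is closest
/// to the query image, together with the distance (smaller is more similar).
func closestMatch(to query: CGImage,
                  among stored: [CGImage]) throws -> (index: Int, distance: Float)? {
    guard let queryPrint = try featurePrint(for: query) else { return nil }
    var best: (index: Int, distance: Float)?
    for (index, candidate) in stored.enumerated() {
        guard let candidatePrint = try featurePrint(for: candidate) else { continue }
        var distance = Float(0)
        try queryPrint.computeDistance(&distance, to: candidatePrint)
        if let current = best, current.distance <= distance { continue }
        best = (index, distance)
    }
    return best
}
```

In a real app you would probably compute and cache the feature print once when the user saves a photo, instead of recomputing it for every comparison as this sketch does.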

VNCalculateImageAestheticsScoresRequest not working on SIM

I am trying to use the new VNCalculateImageAestheticsScoresRequest API. The code compiles and runs but delivers the same result for every image in the Xcode 16 Beta 2 Simulator. Am I missing anything?

Community Apple Developers Xcode Vision VisionKit

782

Jul ’24

Demonstrating Immersive video AVP apps

What is the best way to demonstrate or create 2D video to demonstrate an immersive video app? So far I've shared the AVP to my desktop Mac and screen captured the resulting view. Rather shaky at times. With visionOS 2.0 beta (2) is there a better way? Thanks, David

App Store Distribution & Marketing General VisionKit

703

Jul ’24

Capture Video from my own app using enterprise APIs in visionOS

Hello, I want to capture video from Vision Pro in a visionOS app. I am following the Apple video (https://developer.apple.com/videos/play/wwdc2024/10139/) and its code. My steps so far are:

import ARKit

com.apple.developer.arkit.main-camera-access.allow = true in info.plist

Then run the code below:

    func loadCameraFeed() async {
        // Main Camera Feed Access Example
        let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
        let cameraFrameProvider = CameraFrameProvider()
        var arKitSession = ARKitSession()
        var pixelBuffer: CVPixelBuffer?

        await arKitSession.queryAuthorization(for: [.cameraAccess])

        do {
            try await arKitSession.run([cameraFrameProvider])
        } catch {
            return
        }

        guard let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
            return
        }
        print(cameraFrameUpdates)

        for await cameraFrame in cameraFrameUpdates {
            print(cameraFrame)
            guard let mainCameraSample = cameraFrame.sample(for: .left) else {
                continue
            }
            pixelBuffer = mainCameraSample.pixelBuffer
        }
    }

I want to convert "pixelBuffer" into a video stream and show it in a frame, as on iOS. Please guide me on how to achieve this next step. I am stuck after this code.

Spatial Computing ARKit ARKit Vision VisionKit visionOS

1.4k

Jun ’24
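
For the pixel-buffer question above, one possible next step is to turn each CVPixelBuffer into a CGImage with Core Image and hand that to the UI. This is only a sketch: the names sharedContext and makeImage(from:) are hypothetical, and it covers the conversion alone, not the ARKit session or the enterprise entitlement setup.

```swift
import CoreImage
import CoreVideo

/// One shared context; creating a CIContext for every frame is expensive.
let sharedContext = CIContext()

/// Converts a camera pixel buffer into a CGImage that a view can display.
/// Returns nil if Core Image cannot render the buffer.
func makeImage(from pixelBuffer: CVPixelBuffer,
               context: CIContext = sharedContext) -> CGImage? {
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    return context.createCGImage(ciImage, from: ciImage.extent)
}
```

Inside the `for await cameraFrame in cameraFrameUpdates` loop you could call makeImage(from:) on each buffer and publish the result to the main actor, where a SwiftUI Image created from the CGImage would redraw as new frames arrive.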

Progressive immersive space and Digital Crown (and ARKit)

I am new to visionOS development, just slowly figuring out the difference in immersion styles to figure out how I want my app to behave. It seems that when you use a progressive immersive space the minimum immersion level (set via the digital crown) is not 0? Meaning, there is no way to go from mixed to full by using the Digital Crown. Even when I try to set it to 0 (such as in the Destination Video sample), it pops back up to around 30-40%, and I always see the background. Is this expected behavior, or are there some settings that allow me to change this minimum immersion level? Further, in the video 'Meet ARKit for spatial computing', it is stated that to get access to ARKit tracking data you must use a 'Full Space', not the 'Shared Space'. This wording is confusing to me. Is an ImmersiveSpace set to the .mixed (or .progressive) immersion style still a 'Full Space' (because it isn't in the shared space, with other apps)? OR, is ARKit only available in an ImmersiveSpace with the .full immersion style? Just feels like maybe 'full' is being used in two different ways here... Thanks in advance, -pj

Spatial Computing ARKit ARKit VisionKit visionOS

1.7k

Jun ’24

Visionkit can lift a subject. But the bounding rectangle is always returning x,y,width,height values as 0,0,0,0

In our app, we needed to use the VisionKit framework to lift the subject from an image and crop it. Here is the piece of code:

    if #available(iOS 17.0, *) {
        let analyzer = ImageAnalyzer()
        let analysis = try? await analyzer.analyze(image, configuration: self.visionKitConfiguration)
        let interaction = ImageAnalysisInteraction()
        interaction.analysis = analysis
        interaction.preferredInteractionTypes = [.automatic]
        guard let subject = await interaction.subjects.first else {
            return image
        }
        let s = await interaction.subjects
        print(s.first?.bounds)
        guard let cropped = try? await subject.image else {
            return image
        }
        return cropped
    }

But s.first?.bounds always returns a CGRect with all values 0. Is there any other way to get the position of the cropped subject? I need the position in the image from where the subject was cropped. Can anyone help?

Programming Languages Swift Swift VisionKit

1.2k

May ’24

Specific barcode not recognized

During development I ran into a problem: I could not scan a Code39 barcode with an iPad using Vision. The sample label I used for testing has multiple Code39 barcodes on it, and I could scan almost all of them except for one specific barcode. When I use a conventional barcode scanner or free apps, I can scan that barcode with no problem. It fails only when I use Vision. Has anyone faced a similar situation? Do you know why a specific barcode could not be scanned on iPad with Vision?

App & System Services General Vision VisionKit Live Text

797

May ’24
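
To narrow down a case like the one above, it can help to run Vision on a still photo of the label with the request restricted to the Code 39 variants, so that other symbologies cannot interfere. The sketch below assumes you already have the label as a CGImage; the function name detectCode39Payloads(in:) is hypothetical, and nothing here is confirmed to fix the poster's specific barcode.

```swift
import CoreGraphics
import Vision

/// Detects Code 39 barcodes in a still image and returns their decoded payloads.
func detectCode39Payloads(in image: CGImage) throws -> [String] {
    let request = VNDetectBarcodesRequest()
    // Limit detection to the Code 39 family, including the checksum
    // and full-ASCII variants, which are separate symbologies in Vision.
    request.symbologies = [
        .code39,
        .code39Checksum,
        .code39FullASCII,
        .code39FullASCIIChecksum
    ]
    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])
    return (request.results ?? []).compactMap { $0.payloadStringValue }
}
```

If the problematic barcode decodes from a sharp still image but not from the live camera, the cause is more likely capture quality (focus, glare, resolution) than the symbology itself.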

Failed to load 12K Panorama photo，Request help to solve, loading 5.7K is normal to read the image texture

    extension Entity {
        func addPanoramicImage(for media: WRMedia) {
            let subscription = TextureResource.loadAsync(named: "image_20240425_201630").sink(
                receiveCompletion: {
                    switch $0 {
                    case .finished: break
                    case .failure(let error): assertionFailure("\(error)")
                    }
                },
                receiveValue: { [weak self] texture in
                    guard let self = self else { return }
                    var material = UnlitMaterial()
                    material.color = .init(texture: .init(texture))
                    self.components.set(ModelComponent(
                        mesh: .generateSphere(radius: 1E3),
                        materials: [material]
                    ))
                    self.scale *= .init(x: -1, y: 1, z: 1)
                    self.transform.translation += SIMD3(0.0, -1, 0.0)
                }
            )
            components.set(Entity.WRSubscribeComponent(subscription: subscription))
        }
    }

Problem:

    case .failure(let error): assertionFailure("\(error)")

    Thread 1: Fatal error: Error Domain=MTKTextureLoaderErrorDomain Code=0 "Image decoding failed" UserInfo={NSLocalizedDescription=Image decoding failed, MTKTextureLoaderErrorKey=Image decoding failed}

Graphics & Games RealityKit VisionKit RealityKit Reality Composer Pro visionOS

864

May ’24
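
Since the 5.7K image loads and the 12K one does not, one workaround worth trying is to downsample the panorama with ImageIO before creating the texture. This is a sketch under assumptions: the helper name downsampledImage(at:maxPixelSize:) is hypothetical, and the size you pass in is a value to experiment with, not a documented limit of the texture loader.

```swift
import CoreGraphics
import Foundation
import ImageIO

/// Decodes an image file at reduced size so that its longest side is
/// at most `maxPixelSize` pixels. Returns nil if the file cannot be read.
func downsampledImage(at url: URL, maxPixelSize: Int) -> CGImage? {
    // Avoid caching the full-size decoded image in memory.
    let sourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
    guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions) else {
        return nil
    }
    let thumbnailOptions: [CFString: Any] = [
        kCGImageSourceCreateThumbnailFromImageAlways: true,
        kCGImageSourceCreateThumbnailWithTransform: true,
        kCGImageSourceShouldCacheImmediately: true,
        kCGImageSourceThumbnailMaxPixelSize: maxPixelSize
    ]
    return CGImageSourceCreateThumbnailAtIndex(source, 0, thumbnailOptions as CFDictionary)
}
```

The resulting CGImage could then be passed to a RealityKit texture API that accepts a CGImage instead of loading the original file by name. For example, calling downsampledImage(at: panoramaURL, maxPixelSize: 8192) would cap the longest side at 8192 pixels, where panoramaURL and 8192 are both placeholders.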

Reading 12K panoramic images system API read error, VisionOS does not support 12K panoramic photos view

    extension Entity {
        func addPanoramicImage(for media: WRMedia) {
            let subscription = TextureResource.loadAsync(named: "image_20240425_201630").sink(
                receiveCompletion: {
                    switch $0 {
                    case .finished: break
                    case .failure(let error): assertionFailure("\(error)")
                    }
                },
                receiveValue: { [weak self] texture in
                    guard let self = self else { return }
                    var material = UnlitMaterial()
                    material.color = .init(texture: .init(texture))
                    self.components.set(ModelComponent(
                        mesh: .generateSphere(radius: 1E3),
                        materials: [material]
                    ))
                    self.scale *= .init(x: -1, y: 1, z: 1)
                    self.transform.translation += SIMD3(0.0, -1, 0.0)
                }
            )
            components.set(Entity.WRSubscribeComponent(subscription: subscription))
        }

        func updateRotation(for media: WRMedia) {
            let angle = Angle.degrees(0.0)
            let rotation = simd_quatf(angle: Float(angle.radians), axis: SIMD3<Float>(0, 0.0, 0))
            self.transform.rotation = rotation
        }

        struct WRSubscribeComponent: Component {
            var subscription: AnyCancellable
        }
    }

Problem:

    case .failure(let error): assertionFailure("\(error)")

    Thread 1: Fatal error: Error Domain=MTKTextureLoaderErrorDomain Code=0 "Image decoding failed" UserInfo={NSLocalizedDescription=Image decoding failed, MTKTextureLoaderErrorKey=Image decoding failed}

Media Technologies Video Vision Video VisionKit visionOS

851

Apr ’24

Noob seeking help, visionkit implementation!

Hi all Apple devs! I am a young developer who is completely new to everything programming. I am currently trying to develop an app where I want to use VisionKit, but I can't for the life of me figure out how to implement its features. I've been stuck on this for several days, so I am now resorting to asking all of you experts for help! Your assistance would be immensely appreciated!

I started developing the app using SwiftUI exclusively, to future-proof it. From what I understand of VisionKit, it is more compatible with UIKit? So I rewrote the part of my code that will use VisionKit into a UIKit-based view, to simplify the integration of VisionKit's features. It might just have overcomplicated my code? Can VisionKit be easily implemented using only SwiftUI? I noticed that in the demo in the video tutorial the code is in a view controller, not a ContentView. Is this what makes my image unresponsive? My image is not interactable like in her demo in the video. Where in my code do I go wrong? Help a noob out!

The desired user flow is like this:

User selects an image through the "Open camera" or "Open Camera Roll" buttons.

Upon selection, the UIKit-based view opens and the selected image is displayed on it. (This is where I want to implement VisionKit features.)

User interacts with the image by touching it. If the touch lands on a subject, the subject should be lifted out of the rest of the image and assigned to editedImage, which in turn displays only the subject without the background on the ContentView. (For now the image is assigned to editedImage by long-pressing, without any subject lifting, since I can't get VisionKit to work as I want.)

Anyways, here's a code snippet of my peculiar effort to implement subject lifting and VisionKit in my app:

UI Frameworks UIKit iOS SwiftUI UIKit VisionKit

865

Apr ’24

What is the maximum data processing speed?

For example: we use DockKit for birdwatching, so we have an unknown field distance and direction. Distance = ? Direction = ? The observation is made from, for example, a rock. The task is to recognize the number of birds caught in the frame, add a detection frame, and collect statistics. Question: What is the maximum number of frames that can be processed with custom object recognition? If that is not enough, can I do the calculations myself and pass them to DockKit to handle fast movement?

Machine Learning & AI Create ML Vision VisionKit Create ML DockKit

943

Apr ’24

Can I obtain the position of the main window in Vision OS?

Is there a way to retrieve the position of the main window in Vision OS? I'd like to implement a feature where users can drag the window and have a 3D model follow it

UI Frameworks SwiftUI Vision VisionKit visionOS

Replies: 1
Boosts: 0
Views: 907
Activity: Aug ’24

Vision & VisionKit reading direction in iOS 18.0 beta differs from iOS 16 and 17

Graphics & Games General Vision VisionKit

Replies: 1
Boosts: 0
Views: 711
Activity: Aug ’24

Given an Image with varying contours create a stroke/outline

UI Frameworks SwiftUI Vision SwiftUI VisionKit Core Image

Replies: 0
Boosts: 0
Views: 629
Activity: Aug ’24

VisionKit crashes on iOS 16.4.

Machine Learning & AI Apple Intelligence VisionKit

Replies: 3
Boosts: 1
Views: 836
Activity: Jul ’24

What is the technology of right clicking to automatically recognize objects in photos

Does anyone know which control is used to automatically recognize objects in photos and achieve the function of cutout by right-clicking the mouse?

Media Technologies Photos & Camera VisionKit

Replies: 1
Boosts: 0
Views: 833
Activity: Jul ’24

Vision Pro OS file location

App & System Services Core OS VisionKit visionOS Files and Storage

Replies: 1
Boosts: 0
Views: 1.2k
Activity: Jul ’24

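What technology automatically recognizes objects in photos on right-click?

Does anyone know which control provides the feature where right-clicking with the mouse automatically recognizes an object in a photo and then lets you cut it out?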

Media Technologies Photos & Camera VisionKit

Replies: 0
Boosts: 0
Views: 606
Activity: Jul ’24

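What is the feature that recognizes objects in a photo on long press and shows their outline?

The system photo album on the phone has a feature that recognizes objects on long press. What is this feature called in Apple development, and which control should I use to get it?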

Media Technologies Photos & Camera VisionKit

Replies: 0
Boosts: 0
Views: 716
Activity: Jul ’24

Long press to automatically recognize objects in photos and display the outline function

There is a long press recognition feature in the photo album of the mobile phone system. What is this feature called in Apple development, and which control should I use to have this feature?

Media Technologies Photos & Camera VisionKit

Replies: 1
Boosts: 0
Views: 740
Activity: Jul ’24

Can you match a new photo with existing images?

Machine Learning & AI General Vision Machine Learning VisionKit Core ML

Replies: 2
Boosts: 0
Views: 1.2k
Activity: Jul ’24

VNCalculateImageAestheticsScoresRequest not working on SIM

I am trying to use the new VNCalculateImageAestheticsScoresRequest API. The code compiles and runs but delivers the same result for every image in the Xcode 16 Beta 2 Simulator. Am I missing anything?

Community Apple Developers Xcode Vision VisionKit

Replies: 1
Boosts: 0
Views: 782
Activity: Jul ’24

Demonstrating Immersive video AVP apps

App Store Distribution & Marketing General VisionKit

Replies: 0
Boosts: 0
Views: 703
Activity: Jul ’24

Capture Video from my own app using enterprise APIs in visionOS

Spatial Computing ARKit ARKit Vision VisionKit visionOS

Replies: 1
Boosts: 0
Views: 1.4k
Activity: Jun ’24

Progressive immersive space and Digital Crown (and ARKit)

Spatial Computing ARKit ARKit VisionKit visionOS

Replies: 2
Boosts: 0
Views: 1.7k
Activity: Jun ’24

Visionkit can lift a subject. But the bounding rectangle is always returning x,y,width,height values as 0,0,0,0

Programming Languages Swift Swift VisionKit

Replies: 1
Boosts: 0
Views: 1.2k
Activity: May ’24

Specific barcode not recognized

App & System Services General Vision VisionKit Live Text

Replies: 0
Boosts: 0
Views: 797
Activity: May ’24

Failed to load 12K Panorama photo，Request help to solve, loading 5.7K is normal to read the image texture

Graphics & Games RealityKit VisionKit RealityKit Reality Composer Pro visionOS

Replies: 2
Boosts: 0
Views: 864
Activity: May ’24

Reading 12K panoramic images system API read error, VisionOS does not support 12K panoramic photos view

Media Technologies Video Vision Video VisionKit visionOS

Replies: 1
Boosts: 0
Views: 851
Activity: Apr ’24

Noob seeking help, visionkit implementation!

UI Frameworks UIKit iOS SwiftUI UIKit VisionKit

Replies: 2
Boosts: 0
Views: 865
Activity: Apr ’24

What is the maximum data processing speed?

Machine Learning & AI Create ML Vision VisionKit Create ML DockKit

Replies: 0
Boosts: 0
Views: 943
Activity: Apr ’24

Posts under VisionKit tag
