VisionKit

Scan documents with the camera on iPhone and iPad devices using VisionKit.

Posts under VisionKit tag

169 Posts

Post

Replies

Boosts

Views

Activity

Can I obtain the position of the main window in visionOS?
Is there a way to retrieve the position of the main window in visionOS? I'd like to implement a feature where users can drag the window and have a 3D model follow it.
Replies
1
Boosts
0
Views
907
Activity
Aug ’24
Vision & VisionKit reading direction in iOS 18.0 beta differs from iOS 16 and 17
I have tested my application on iOS 15, 16, and 17. On those versions VisionKit reads values in the horizontal direction. After I updated my device to iOS 18.0 beta, values are read in the vertical direction. The build was generated with Xcode 13.4.1. Could the team please help me understand why this happens and whether anything needs to change at the code level?
Replies
1
Boosts
0
Views
711
Activity
Aug ’24
Given an Image with varying contours create a stroke/outline
Essentially, I'm trying to find the most straightforward/simple way to outline an Image with varying contours. The intention is similar to the way iMessage allows you to add an outline to a sticker. The "goal" in the example is simply the input image on top of the outline.
Replies
0
Boosts
0
Views
629
Activity
Aug ’24
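For the outline question above, one simple approach (not necessarily how iMessage does it) is to dilate the image's alpha mask and keep only the newly covered pixels as the stroke. The sketch below assumes the subject has already been reduced to a row-major array of Booleans; `outlineMask` and its parameters are hypothetical names used only for illustration, and the brute-force search is meant for clarity rather than speed.

```swift
// Hypothetical helper: returns a mask of the pixels that lie outside the
// subject but within `radius` pixels of it (a stroke around the subject).
// `subject` is a row-major mask where true means "pixel belongs to the image".
func outlineMask(of subject: [Bool], width: Int, height: Int, radius: Int) -> [Bool] {
    precondition(subject.count == width * height, "mask must hold width * height values")
    precondition(radius >= 0, "radius must not be negative")

    var outline = [Bool](repeating: false, count: subject.count)
    for y in 0..<height {
        for x in 0..<width where !subject[y * width + x] {
            // Scan the circular neighbourhood for any subject pixel.
            search: for dy in -radius...radius {
                for dx in -radius...radius where dx * dx + dy * dy <= radius * radius {
                    let nx = x + dx
                    let ny = y + dy
                    guard nx >= 0, nx < width, ny >= 0, ny < height else { continue }
                    if subject[ny * width + nx] {
                        outline[y * width + x] = true
                        break search
                    }
                }
            }
        }
    }
    return outline
}
```

Filling the returned mask with the stroke color and drawing the original image on top would give the "image on top of the outline" result described in the post.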
VisionKit crashes on iOS 16.4.
The app crashes on iOS 16.4 when it uses the ImageAnalysisInteraction API from VisionKit. It crashes before it even starts. Here is the output:

    dyld[3240]: Symbol not found: _$s9VisionKit24ImageAnalysisInteractionC7subject2atAC7SubjectVSgSo7CGPointV_tYaFTu
      Referenced from: <BAD7A699-FB4E-3D0E-8CD4-45CC9FC3D5E5> /Users/sereza/Library/Developer/CoreSimulator/Devices/B64EAF39-0DD9-49EC-A3F7-69675C94B8BE/data/Containers/Bundle/Application/F4E30E86-ED4D-4748-AB99-434208D55483/VisionKitChecker.app/VisionKitChecker
      Expected in: <F05E3A17-D74A-3EE2-BC8D-DDCC23E48707> /Library/Developer/CoreSimulator/Volumes/iOS_20E247/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS 16.4.simruntime/Contents/Resources/RuntimeRoot/System/Library/Frameworks/VisionKit.framework/VisionKit

Here is enough code to reproduce the crash. Please note that this code never gets called; it is enough that it exists in the project:

    import VisionKit

    @MainActor
    final class LiftHelper: ObservableObject {
        func doSomething() async throws {
            let interaction = ImageAnalysisInteraction()
            let _ = try await interaction.image(for: [])
        }
    }
Replies
3
Boosts
1
Views
836
Activity
Jul ’24
What is the technology of right clicking to automatically recognize objects in photos
Does anyone know which control is used to automatically recognize objects in photos and achieve the function of cutout by right-clicking the mouse?
Replies
1
Boosts
0
Views
833
Activity
Jul ’24
Vision Pro OS file location
I would like to know the global path of the Vision Pro file system. For instance, if I put a file called example.pdf inside "On My Apple Vision Pro", what would be the global path for that file? "On My Apple Vision Pro/user_name/example.pdf", "/example.pdf", "/username/example.pdf", or something else? I searched for this but couldn't find any official source. Thanks in advance!
Replies
1
Boosts
0
Views
1.2k
Activity
Jul ’24
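What technology automatically recognizes objects in photos on right-click?
Does anyone know which control is used for the feature where right-clicking with the mouse automatically recognizes the object in a photo and then lets you cut it out?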
Replies
0
Boosts
0
Views
606
Activity
Jul ’24
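What is the feature that automatically recognizes objects in a photo on long press and shows their outline?
The system photo album on the phone has a long-press feature that recognizes objects. What is this feature called in Apple development, and which control should I use to get it?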
Replies
0
Boosts
0
Views
716
Activity
Jul ’24
Long press to automatically recognize objects in photos and display the outline function
There is a long press recognition feature in the photo album of the mobile phone system. What is this feature called in Apple development, and which control should I use to have this feature?
Replies
1
Boosts
0
Views
740
Activity
Jul ’24
Can you match a new photo with existing images?
I'm looking for a way to take a picture of, or point the camera at, a piece of clothing and match that image with one the user has stored in my app. I'm storing the data in a Core Data database as a Binary Data object. Since the user also takes the pictures that are stored in the database, I think I cannot use pre-trained Core ML models. I would like the matching to be done on device if possible instead of going to an external service. An external service would probably describe the item based on what the AI sees, but then I cannot match the item with the images stored in the app. Does anyone know if this is possible with frameworks such as Vision or VisionKit?
Replies
2
Boosts
0
Views
1.2k
Activity
Jul ’24
VNCalculateImageAestheticsScoresRequest not working on Simulator
I am trying to use the new VNCalculateImageAestheticsScoresRequest API. The code compiles and runs, but it delivers the same result for every image. Environment: Xcode 16 Beta 2, Simulator. Am I missing anything?
Replies
1
Boosts
0
Views
782
Activity
Jul ’24
Demonstrating Immersive video AVP apps
What is the best way to demonstrate or create 2D video to demonstrate an immersive video app? So far I've shared the AVP to my desktop Mac and screen captured the resulting view. Rather shaky at times. With visionOS 2.0 beta (2) is there a better way? Thanks, David
Replies
0
Boosts
0
Views
703
Activity
Jul ’24
Capture Video from my own app using enterprise APIs in visionOS
Hello, I want to capture video from Vision Pro in a visionOS app. I am following the Apple video (https://developer.apple.com/videos/play/wwdc2024/10139/) and its code. My steps are below:

    import ARKit
    com.apple.developer.arkit.main-camera-access.allow = true in info.plist

Then I run the code below:

    func loadCameraFeed() async {
        // Main Camera Feed Access Example
        let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
        let cameraFrameProvider = CameraFrameProvider()
        var arKitSession = ARKitSession()
        var pixelBuffer: CVPixelBuffer?

        await arKitSession.queryAuthorization(for: [.cameraAccess])

        do {
            try await arKitSession.run([cameraFrameProvider])
        } catch {
            return
        }

        guard let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
            return
        }
        print(cameraFrameUpdates)

        for await cameraFrame in cameraFrameUpdates {
            print(cameraFrame)
            guard let mainCameraSample = cameraFrame.sample(for: .left) else {
                continue
            }
            pixelBuffer = mainCameraSample.pixelBuffer
        }
    }

I want to convert "pixelBuffer" into a video stream and show it in a frame, as on iOS. Please guide me on how to achieve my next step. I am stuck after this code.
Replies
1
Boosts
0
Views
1.4k
Activity
Jun ’24
Progressive immersive space and Digital Crown (and ARKit)
I am new to visionOS development, just slowly figuring out the difference in immersion styles to figure out how I want my app to behave. It seems that when you use a progressive immersive space the minimum immersion level (set via the digital crown) is not 0? Meaning, there is no way to go from mixed to full by using the Digital Crown. Even when I try to set it to 0 (such as in the Destination Video sample), it pops back up to around 30-40%, and I always see the background. Is this expected behavior, or are there some settings that allow me to change this minimum immersion level? Further, in the video 'Meet ARKit for spatial computing', it is stated that to get access to ARKit tracking data you must use a 'Full Space', not the 'Shared Space'. This wording is confusing to me. Is an ImmersiveSpace set to the .mixed (or .progressive) immersion style still a 'Full Space' (because it isn't in the shared space, with other apps)? OR, is ARKit only available in an ImmersiveSpace with the .full immersion style? Just feels like maybe 'full' is being used in two different ways here... Thanks in advance, -pj
Replies
2
Boosts
0
Views
1.7k
Activity
Jun ’24
VisionKit can lift a subject, but the bounding rectangle always returns x, y, width, height values of 0, 0, 0, 0
In our app, we need to use the VisionKit framework to lift the subject from an image and crop it. Here is the piece of code:

    if #available(iOS 17.0, *) {
        let analyzer = ImageAnalyzer()
        let analysis = try? await analyzer.analyze(image, configuration: self.visionKitConfiguration)
        let interaction = ImageAnalysisInteraction()
        interaction.analysis = analysis
        interaction.preferredInteractionTypes = [.automatic]
        guard let subject = await interaction.subjects.first else {
            return image
        }
        let s = await interaction.subjects
        print(s.first?.bounds)
        guard let cropped = try? await subject.image else {
            return image
        }
        return cropped
    }

But s.first?.bounds always returns a CGRect with all zero values. Is there any other way to get the position of the cropped subject? I need the position in the image from which the subject was cropped. Can anyone help?
Replies
1
Boosts
0
Views
1.2k
Activity
May ’24
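For the subject-bounds question above, a possible fallback when the reported bounds are all zero is to derive the rectangle from a subject mask instead. This is only a sketch under assumptions: it presumes a single-channel, row-major mask has already been obtained from somewhere (for example, Vision's VNGenerateForegroundInstanceMaskRequest on iOS 17 is one candidate source), and `PixelRect` and `boundingBox(ofMask:width:height:threshold:)` are hypothetical names, not VisionKit API.

```swift
// Hypothetical value type for a rectangle in pixel coordinates.
struct PixelRect: Equatable {
    var x: Int
    var y: Int
    var width: Int
    var height: Int
}

// Hypothetical helper: bounding box of every mask sample at or above `threshold`.
// `mask` is row-major with one UInt8 per pixel; returns nil when nothing is set.
func boundingBox(ofMask mask: [UInt8], width: Int, height: Int, threshold: UInt8 = 128) -> PixelRect? {
    precondition(mask.count == width * height, "mask must hold width * height samples")

    var minX = width, minY = height, maxX = -1, maxY = -1
    for y in 0..<height {
        for x in 0..<width where mask[y * width + x] >= threshold {
            minX = min(minX, x)
            maxX = max(maxX, x)
            minY = min(minY, y)
            maxY = max(maxY, y)
        }
    }

    // No pixel reached the threshold, so there is no subject to bound.
    guard maxX >= 0 else { return nil }
    return PixelRect(x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1)
}
```

The result is in mask pixel coordinates with the origin at the top-left sample, so it would still need converting if the mask and the source image differ in size or orientation.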
Specific barcode not recognized
During development I ran into a problem: I could not scan a Code39 barcode with an iPad using Vision. The sample label I used for testing has multiple Code39 barcodes on it, and I could scan almost all of them except for one specific barcode. When I use a conventional barcode scanner or free apps, I can scan that barcode with no problem. It fails only when I use Vision. Has anyone faced a similar situation? Do you know why a specific barcode cannot be scanned on an iPad with Vision?
Replies
0
Boosts
0
Views
797
Activity
May ’24
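For the Code39 question above, one thing that may be worth ruling out (this is a guess, not a confirmed cause) is a mismatch between the label and the requested symbology variant, since Vision distinguishes `.code39` from `.code39Checksum`. The sketch below computes the standard Code 39 modulo-43 check character so it can be compared with the last character printed on the failing label; `code39Alphabet` and `code39CheckCharacter(for:)` are hypothetical names introduced for illustration.

```swift
// The 43 Code 39 data characters in value order (0 through 42).
let code39Alphabet = Array("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%")

// Hypothetical helper: modulo-43 check character for a Code 39 payload.
// Returns nil if the payload contains a character outside the basic set.
func code39CheckCharacter(for payload: String) -> Character? {
    var sum = 0
    for character in payload {
        guard let value = code39Alphabet.firstIndex(of: character) else { return nil }
        sum += value
    }
    return code39Alphabet[sum % 43]
}
```

If the label's final character matches the computed value for the preceding characters, the barcode probably carries a checksum, and the set of symbologies passed to the request may deserve a second look.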
Failed to load a 12K panorama photo; requesting help. Loading a 5.7K image texture works normally
    extension Entity {
        func addPanoramicImage(for media: WRMedia) {
            let subscription = TextureResource.loadAsync(named: "image_20240425_201630").sink(
                receiveCompletion: {
                    switch $0 {
                    case .finished: break
                    case .failure(let error): assertionFailure("\(error)")
                    }
                },
                receiveValue: { [weak self] texture in
                    guard let self = self else { return }
                    var material = UnlitMaterial()
                    material.color = .init(texture: .init(texture))
                    self.components.set(ModelComponent(
                        mesh: .generateSphere(radius: 1E3),
                        materials: [material]
                    ))
                    self.scale *= .init(x: -1, y: 1, z: 1)
                    self.transform.translation += SIMD3(0.0, -1, 0.0)
                }
            )
            components.set(Entity.WRSubscribeComponent(subscription: subscription))
        }
    }

Problem: the failure case is hit.

    case .failure(let error): assertionFailure("\(error)")

    Thread 1: Fatal error: Error Domain=MTKTextureLoaderErrorDomain Code=0 "Image decoding failed" UserInfo={NSLocalizedDescription=Image decoding failed, MTKTextureLoaderErrorKey=Image decoding failed}
Replies
2
Boosts
0
Views
864
Activity
May ’24
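For the 12K panorama post above, a possible workaround (assuming the failure is size-related, which the post does not confirm) is to downscale the image before creating the texture. The sketch below only computes an aspect-preserving target size; `PixelSize`, `fittedSize(for:maxDimension:)`, and the pixel dimensions and 8192 limit used in the usage note are all illustrative assumptions, not documented visionOS limits.

```swift
// Hypothetical value type for an image size in pixels.
struct PixelSize: Equatable {
    var width: Int
    var height: Int
}

// Hypothetical helper: shrinks `source` so its longest side is at most
// `maxDimension`, preserving aspect ratio. Smaller images are returned unchanged.
func fittedSize(for source: PixelSize, maxDimension: Int) -> PixelSize {
    let longest = max(source.width, source.height)
    guard longest > maxDimension, longest > 0 else { return source }

    let scale = Double(maxDimension) / Double(longest)
    return PixelSize(
        width: max(1, Int((Double(source.width) * scale).rounded())),
        height: max(1, Int((Double(source.height) * scale).rounded()))
    )
}
```

With an assumed 12288 x 6144 source and an assumed limit of 8192, this yields 8192 x 4096, while an assumed 5760 x 2880 source is left untouched. The resized image would then be loaded in place of the original asset.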
Reading 12K panoramic images fails with a system API read error; visionOS does not support viewing 12K panoramic photos
    extension Entity {
        func addPanoramicImage(for media: WRMedia) {
            let subscription = TextureResource.loadAsync(named: "image_20240425_201630").sink(
                receiveCompletion: {
                    switch $0 {
                    case .finished: break
                    case .failure(let error): assertionFailure("\(error)")
                    }
                },
                receiveValue: { [weak self] texture in
                    guard let self = self else { return }
                    var material = UnlitMaterial()
                    material.color = .init(texture: .init(texture))
                    self.components.set(ModelComponent(
                        mesh: .generateSphere(radius: 1E3),
                        materials: [material]
                    ))
                    self.scale *= .init(x: -1, y: 1, z: 1)
                    self.transform.translation += SIMD3(0.0, -1, 0.0)
                }
            )
            components.set(Entity.WRSubscribeComponent(subscription: subscription))
        }

        func updateRotation(for media: WRMedia) {
            let angle = Angle.degrees(0.0)
            let rotation = simd_quatf(angle: Float(angle.radians), axis: SIMD3<Float>(0, 0.0, 0))
            self.transform.rotation = rotation
        }

        struct WRSubscribeComponent: Component {
            var subscription: AnyCancellable
        }
    }

The failure case is hit:

    case .failure(let error): assertionFailure("\(error)")

    Thread 1: Fatal error: Error Domain=MTKTextureLoaderErrorDomain Code=0 "Image decoding failed" UserInfo={NSLocalizedDescription=Image decoding failed, MTKTextureLoaderErrorKey=Image decoding failed}
Replies
1
Boosts
0
Views
851
Activity
Apr ’24
Noob seeking help, VisionKit implementation!
Hi all Apple devs! I am a young developer who is completely new to everything programming. I am currently trying to develop an app where I want to use VisionKit, but I can't for the life of me figure out how to implement its features. I've been stuck on this for several days, so I am now resorting to asking all of you experts for help! Your assistance would be immensely appreciated!

I started developing the app using SwiftUI exclusively to future-proof it. After figuring out what VisionKit is, my understanding is that it is more compatible with UIKit? So I rewrote the part of my code that will use VisionKit as a UIKit-based view, to simplify the integration of VisionKit's features. It might just have overcomplicated my code. Can VisionKit be easily implemented using only SwiftUI? I noticed that in the demo in the video tutorial the code is in a view controller, not a ContentView. Is this what makes my image unresponsive? My image is not interactable like the demo in the video; where in my code do I go wrong? Help a noob out!

The desired user flow is like this:
User selects an image through the "Open camera" or "Open Camera Roll" buttons.
Upon selection, the UIKit-based view opens and the selected image is displayed on it. (This is where I want to implement VisionKit features.)
User interacts with the image by touching it. If the touch lands on a subject, the subject should be lifted out of the rest of the image and assigned to editedImage, which in turn displays only the subject, without the background, on the ContentView. (For now the image is assigned to editedImage by long-pressing, without any subject lifting, since I can't get VisionKit to work as I want.)

Anyway, here's a code snippet of my peculiar effort to implement subject lifting and VisionKit in my app:
Replies
2
Boosts
0
Views
865
Activity
Apr ’24
What is the maximum data processing speed?
For example: we use DockKit for birdwatching, so the field distance and direction are unknown (Distance = ?, Direction = ?), for example from the rock where the observation is made. The task is to recognize the number of birds caught in the frame, add a detection frame, and collect statistics. Question: What is the maximum number of frames that can be processed with custom object recognition? If that is not enough, can I do the calculations myself and pass them to DockKit for fast movement?
Replies
0
Boosts
0
Views
943
Activity
Apr ’24