As the title suggests: is it possible with the current Apple Vision Pro Simulator to recognize objects or humans, as is currently possible on the iPhone? I am not even sure whether there is an API for accessing the cameras of the Vision Pro.
My goal is to recognize, for example, a human, and attach a 3D object to them, for example a hat. Can this be done?
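For reference, this is the kind of iPhone-side detection I mean (a minimal Vision sketch; the CGImage input is assumed to come from a camera frame):

import Vision

func detectHumans(in cgImage: CGImage) throws {
    let request = VNDetectHumanRectanglesRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])
    for human in request.results ?? [] {
        print("Human at \(human.boundingBox)") // normalized image coordinates
    }
}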
I have a visionOS app that streams 180° VR video to create immersive experiences, similar to the Apple TV+ immersive content. These videos are 8K at 60 fps. Looking at the "video encoding requirements" (section 1.25) of the HLS authoring specification, there are only recommendations for MV-HEVC up to 4K at 30 fps. Using the "power of 0.75 rule," I estimate that 160,000 kbps could be close to an 8K60 recommendation.
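For concreteness, this is the scaling I'm applying, taking the 4K30 MV-HEVC recommendation as the base rate \(R_{4K30}\); the factor of 8 is 4x the pixels times 2x the frame rate:

\[ R_{8K60} \approx R_{4K30} \cdot (4 \times 2)^{0.75} = R_{4K30} \cdot 8^{0.75} \approx 4.76\, R_{4K30} \]

so a base recommendation around 33,600 kbps would land near my 160,000 kbps estimate.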
For launch, I made my best guess at the following multivariant playlist, with a generous low end and a very generous high end, and six-second segment durations for all variants. In practice, however, users only seem to pick up the lowest-bandwidth variant (80,000 kbps), even when speed tests on the device show five times that. This leads to a lot of artifacts in the content during playback, since it's lower quality. If I hard-code a higher-bitrate variant (like 240,000 kbps) it does play back, but obviously takes a bit longer to start.
Now that I have my Vision Pro, I've been able to watch the Apple TV+ immersive content. I could be wrong, but it doesn't feel like its playback varies: watching tethered to my phone versus on high-speed Wi-Fi, the content looks the same, just a little slower to load on the phone.
I'm looking for three points of guidance:
Are there HLS recommendations for 8K60 video, both for bitrates and for segment target duration?
Any guesses as to why I am not picking up the higher bitrates? (It could simply be that my higher ends are still too high.)
While 180° VR is just stereo video on a larger canvas, the viewing experience is quite different due to the immersion. Are there special recommendations for 180° VR video, such as offering only one variant at a specified bitrate, since a changing bitrate/video quality could be jarring to the viewer?
Example HLS multivariant playlist:
#EXTM3U
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=80632574,BANDWIDTH=82034658,VIDEO-RANGE=SDR,CODECS="mp4a.40.2,hvc1.1.60000000.L183.B0",RESOLUTION=4096x4096,FRAME-RATE=59.940,CLOSED-CAPTIONS=NONE
https://myurl.com/stream/t4096p_080000_kbps_t6/prog_index.m3u8
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=160782750,BANDWIDTH=162523567,VIDEO-RANGE=SDR,CODECS="mp4a.40.2,hvc1.1.60000000.L186.B0",RESOLUTION=4096x4096,FRAME-RATE=59.940,CLOSED-CAPTIONS=NONE
https://myurl.com/stream/t4096p_160000_kbps_t6/prog_index.m3u8
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=240941602,BANDWIDTH=243997537,VIDEO-RANGE=SDR,CODECS="mp4a.40.2,hvc1.1.60000000.L186.B0",RESOLUTION=4096x4096,FRAME-RATE=59.940,CLOSED-CAPTIONS=NONE
https://myurl.com/stream/t4096p_240000_kbps_t6/prog_index.m3u8
#EXT-X-STREAM-INF:AVERAGE-BANDWIDTH=321061516,BANDWIDTH=325312545,VIDEO-RANGE=SDR,CODECS="mp4a.40.2,hvc1.1.60000000.H255.B0",RESOLUTION=4096x4096,FRAME-RATE=59.940,CLOSED-CAPTIONS=NONE
https://myurl.com/stream/t4096p_320000_kbps_t6/prog_index.m3u8
I have AVSpeechSynthesizer built into six apps for iPad/iOS that were working fine until recently. Sometime between November 2023 and February 2024, they simply quit speaking, across all the apps, for no apparent reason. There have been both Xcode and iOS updates in the interim, but I cannot be sure which caused it. Speech works neither in the Xcode simulator nor on devices.
What did Apple change?
Xcode 15.2, iOS 17+, SwiftUI
let synth = AVSpeechSynthesizer() // kept outside the function so it outlives each call
var thisText = ""

func sayit(thisText: String) {
    let utterance = AVSpeechUtterance(string: thisText)
    utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
    utterance.rate = 0.4
    utterance.preUtteranceDelay = 0.1
    synth.speak(utterance)
}
Hello, we are embedding a PHPickerViewController in compact mode with UIKit (adding the picker as a child view controller, embedding its view, and calling didMove(toParent:)). We are disabling the following capabilities: .collectionNavigation, .selectionActions, and .search.
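For reference, the embedding looks roughly like this (a minimal sketch; the host view controller name is a placeholder):

import PhotosUI
import UIKit

final class PickerHostViewController: UIViewController {
    private func embedPicker() {
        var config = PHPickerConfiguration(photoLibrary: .shared())
        config.mode = .compact // iOS 17+
        config.disabledCapabilities = [.collectionNavigation, .selectionActions, .search]

        let picker = PHPickerViewController(configuration: config)
        addChild(picker)                 // 1. add as a child view controller
        view.addSubview(picker.view)     // 2. embed the picker's view
        picker.view.frame = view.bounds
        picker.didMove(toParent: self)   // 3. notify the child of the move
    }
}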
One of our users, on iOS 17.2.1 with an iPhone 12, encountered a crash with the following stack trace:
Crashed: com.apple.main-thread
0 libsystem_kernel.dylib 0x9fbc __pthread_kill + 8
1 libsystem_pthread.dylib 0x5680 pthread_kill + 268
2 libsystem_c.dylib 0x75b90 abort + 180
3 PhotoFoundation 0x33b0 -[PFAssertionPolicyCrashReport notifyAssertion:] + 66
4 PhotoFoundation 0x3198 -[PFAssertionPolicyComposite notifyAssertion:] + 160
5 PhotoFoundation 0x374c -[PFAssertionPolicyUnique notifyAssertion:] + 176
6 PhotoFoundation 0x2924 -[PFAssertionHandler handleFailureInFunction:file:lineNumber:description:arguments:] + 140
7 PhotoFoundation 0x3da4 _PFAssertFailHandler + 148
8 PhotosUI 0x22050 -[PHPickerViewController _handleRemoteViewControllerConnection:extension:extensionRequestIdentifier:error:completionHandler:] + 1356
9 PhotosUI 0x22b74 __66-[PHPickerViewController _setupExtension:error:completionHandler:]_block_invoke_3 + 52
10 libdispatch.dylib 0x26a8 _dispatch_call_block_and_release + 32
11 libdispatch.dylib 0x4300 _dispatch_client_callout + 20
12 libdispatch.dylib 0x12998 _dispatch_main_queue_drain + 984
13 libdispatch.dylib 0x125b0 _dispatch_main_queue_callback_4CF + 44
14 CoreFoundation 0x3701c __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 16
15 CoreFoundation 0x33d28 __CFRunLoopRun + 1996
16 CoreFoundation 0x33478 CFRunLoopRunSpecific + 608
17 GraphicsServices 0x34f8 GSEventRunModal + 164
18 UIKitCore 0x22c62c -[UIApplication _run] + 888
19 UIKitCore 0x22bc68 UIApplicationMain + 340
20 WorkAngel 0x8060 main + 20 (main.m:20)
21 ??? 0x1bd62adcc (Missing)
Please share if you have any ideas as to what might have caused this, or what to look at in such a case. Unfortunately, I haven't been able to reproduce it myself.
Is AVQT capable of measuring the encoding quality of PQ- or HLG-based content, beyond SDR? If so, how can I leverage it? If not, is there a roadmap for when this type of tool will support it?
Hello, I don't run a podcast, so I am not referring to the Apple Podcasts Connect platform; I have been trying to get in contact with someone at Apple Podcasts. I would like to talk to developer support, or to someone who could consult on how best to approach something I'd like to build as an open-source tool. I listen to a lot of podcasts and would like an analytics dashboard and a toolset for taking notes on the podcasts I listen to in Apple Podcasts. Even just analytics, with access to all of that information, would be a good start. I need to be able to plug into an API and pull all of that data from my account. Is there any way I can access this, or talk to someone about it? I assume I have a lot of historical data from all of the shows I'm subscribed to, and I would like to visualize all of it. Is this possible? From my research, it seems there is no way to access this information from the Podcasts app. Is there any infrastructure for this?
I can't figure out how to get audio from my RealityKitContent bundle to play on Vision Pro.
I have a scene in Reality Composer Pro called "WinterVivarium" which contains a 3D model of a tree, a particle emitter, a ChannelAudio entity, and an audio file (m4a) with 30 minutes of nature sounds.
The 3D model and particle emitter load up just fine on my device, but I get an error when I try to load the audio.
The Swift file is below. When I run the app and this file gets called, it throws the following error:
"Error loading winter vivarium model and/or audio: The operation couldn’t be completed. (RealityKit.__REAsset.LoadError error 2.)"
ChatGPT tells me error code 2 likely means "file not found," but I'm not sure about that.
Please help!
import SwiftUI
import RealityKit
import RealityKitContent

struct WinterVivarium: View {
    @State private var angle: Angle = .degrees(0)

    var body: some View {
        RealityView { content in
            // Path to the audio resource inside the Reality Composer Pro scene
            let audioFilePath = "/Root/back-yard-feb-7am.m4a"
            let audioEntity = Entity()
            do {
                let entity = try await Entity(named: "WinterVivarium", in: realityKitContentBundle)
                content.add(entity)
                // The entity that plays the audio must itself be part of the scene
                content.add(audioEntity)
                let resource = try await AudioFileResource.load(named: audioFilePath,
                                                                from: "WinterVivarium.usda",
                                                                in: realityKitContentBundle)
                let audioController = audioEntity.playAudio(resource)
            } catch {
                print("Error loading winter vivarium model and/or audio: \(error.localizedDescription)")
            }
        }
    }
}

#Preview {
    WinterVivarium()
}
Hello Community,
I'm planning an app to correct a specific child's behavior.
For this to work, the app needs to run in the background and be triggered to use the front camera whenever a pre-defined app (YouTube, for example) is on screen. Every few seconds, snapshots will be taken for image processing and then deleted. When the specific behavior is detected, the app will turn the device volume down (and restore it once the behavior is corrected). The user's photos and data are deleted, and nothing is sent, saved, or shared.
My main concern is that the app is always in the background and uses the camera frequently.
I'm unsure whether that is possible or allowed, and if so, how stable it would be. Most importantly, I do not want this activity to be flagged as suspicious when I upload the app to the App Store.
Hope this is clear.
I would appreciate any advice.
Thanks,
Avi
I'm looking for the "AVAudioEngine in Practice" video from WWDC 2014 (session 502) but can't seem to find it anywhere.
Does anyone have a link to this session video? I can only find the slides. Thanks.
I have a PCM audio buffer (AVAudioPCMFormatInt16). When I try to play it using AVAudioPlayerNode / AVAudioEngine, an exception is thrown:
"[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868"
(related thread https://forums.developer.apple.com/forums/thread/700497?answerId=780530022#780530022)
If I convert the buffer to AVAudioPCMFormatFloat32 playback works.
My questions are:
Does AVAudioEngine / AVAudioPlayerNode require AVAudioPCMBuffer to be in Float32 format? Is there a way to configure it to accept another format instead for my application?
If the answer to 1 is yes, is this documented anywhere?
If the answer to 1 is yes, is this required format subject to change at any point?
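For reference, the workaround conversion I'm doing looks roughly like this (a minimal sketch; it assumes the Int16 and Float32 formats share the same sample rate and channel count):

import AVFoundation

func makeFloat32Buffer(from int16Buffer: AVAudioPCMBuffer) -> AVAudioPCMBuffer? {
    guard let floatFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                          sampleRate: int16Buffer.format.sampleRate,
                                          channels: int16Buffer.format.channelCount,
                                          interleaved: false),
          let converter = AVAudioConverter(from: int16Buffer.format, to: floatFormat),
          let output = AVAudioPCMBuffer(pcmFormat: floatFormat,
                                        frameCapacity: int16Buffer.frameLength) else {
        return nil
    }

    var conversionError: NSError?
    var consumed = false
    converter.convert(to: output, error: &conversionError) { _, outStatus in
        if consumed {
            outStatus.pointee = .endOfStream // single input buffer, one-shot conversion
            return nil
        }
        consumed = true
        outStatus.pointee = .haveData
        return int16Buffer
    }
    return conversionError == nil ? output : nil
}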
Thanks!
I was looking to watch the "AVAudioEngine in Practice" session video from WWDC 2014, but I can't find it anywhere (https://forums.developer.apple.com/forums/thread/747008).
I'm developing an iOS application that uses Core Audio. When I run the app on an Apple Silicon MacBook, the first time I call AudioUnitSetProperty the following error is logged:
CARP violation: using HAL semantics (AUIOImpl_Base)
Is anyone else seeing this, and is it part of the normal process?
I'm also getting "AQMEIO_HAL.cpp:862 kAudioDevicePropertyMute returned err 2003332927" when I set kAudioOutputUnitProperty_EnableIO for input.
Hi,
I'm new to AVAudioEngine (and macOS programming in general).
I'm trying to mix microphone audio with ScreenCaptureKit audio using AVAudioEngine, without playing it back. I've created an AVAudioPlayerNode and I schedule buffers in my SCStream handler:
playerNode.scheduleBuffer(samples)
and have connected the playerNode to the mainMixerNode.
audioEngine.connect(audioEngine.inputNode, to: audioEngine.mainMixerNode, format: micFormat)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: format)
The problem is that mainMixerNode plays the audio through the speakers, creating a feedback loop. How can I prevent the mixer output from being played back?
Also:
Is this the best way of mixing microphone input with another input? I ran into AVAudioEngine's manual rendering mode, which seems like the way to go for mixing audio without playing it back. However, I couldn't figure out how to connect the microphone input to the AVAudioEngine in manual rendering mode.
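For context, this is my current understanding of how manual rendering mode would look (a sketch under assumptions, not working code; latestMicBufferList is a hypothetical helper that returns an AudioBufferList pointer filled with captured mic samples):

import AVFoundation

let engine = AVAudioEngine()
let playerNode = AVAudioPlayerNode()
engine.attach(playerNode)

let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)!

// In manual rendering mode the engine renders only when you pull buffers,
// so nothing is ever sent to the output hardware.
try engine.enableManualRenderingMode(.offline, format: format, maximumFrameCount: 4096)

// Supply microphone samples to inputNode instead of the live device.
_ = engine.inputNode.setManualRenderingInputPCMFormat(format) { frameCount in
    latestMicBufferList(frameCount) // hypothetical: your captured mic audio
}

engine.connect(engine.inputNode, to: engine.mainMixerNode, format: format)
engine.connect(playerNode, to: engine.mainMixerNode, format: format)

try engine.start()
playerNode.play()

// Pull the mixed audio; it never reaches the speakers.
let mixed = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: 4096)!
let status = try engine.renderOffline(4096, to: mixed)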
I'm trying to add a USB mic to my Mac mini running the latest Sonoma release, but the audio is full of crackles. Why isn't it clean?
Is there any way to play panoramic or 360 videos in an immersive space, without using VideoMaterial on a sphere?
I've tried local videos in 4K and 8K quality, and all of them look pixelated with this approach.
I tried both the simulator and the real device, and I can never get high-quality playback.
If the same video is played in a regular 2D player, on the other hand, it shows the expected quality.
I am working on a radio app. This is my first one, and I have a problem with the Lock Screen audio card. According to the docs everything looks OK, but could you please check why I cannot display the Now Playing card on the Lock Screen?
Two code samples: 1. Now Playing, and 2. the logic for the current song and album art.
1. Now Playing
// Create a dictionary to hold the now playing information
var nowPlayingInfo: [String: Any] = [:]

// Set the title of the current song
nowPlayingInfo[MPMediaItemPropertyTitle] = currentSong

// If an album art URL is available, fetch the image asynchronously
if let albumArtUrl = albumArtUrl {
    URLSession.shared.dataTask(with: albumArtUrl) { data, _, error in
        if let data = data, let image = UIImage(data: data) {
            // Create the artwork object
            let artwork = MPMediaItemArtwork(boundsSize: image.size) { _ in image }
            // Update the now playing info with artwork on the main queue
            DispatchQueue.main.async {
                nowPlayingInfo[MPMediaItemPropertyArtwork] = artwork
                MPNowPlayingInfoCenter.default().nowPlayingInfo = nowPlayingInfo
            }
        } else {
            // If fetching the album art fails, set the now playing info without artwork
            MPNowPlayingInfoCenter.default().nowPlayingInfo = nowPlayingInfo
            print("Error retrieving album art data:", error?.localizedDescription ?? "Unknown error")
        }
    }.resume()
} else {
    // If no album art URL is available, set the now playing info without artwork
    MPNowPlayingInfoCenter.default().nowPlayingInfo = nowPlayingInfo
}
}
2. Current song and album art logic
    let parts = currentSong.split(separator: "-", maxSplits: 1, omittingEmptySubsequences: true).map { $0.trimmingCharacters(in: .whitespaces) }
    let titleWithExtra = parts.count > 1 ? parts[1] : ""
    let title = titleWithExtra.components(separatedBy: " (").first ?? titleWithExtra
    return title
}

func updateSongInfo() {
    let url = URL(string: "https://live.heartfm.com.tr/listen/heart_fm/currentsong")!
    URLSession.shared.dataTask(with: url) { data, response, error in
        if let data = data, let songString = String(data: data, encoding: .utf8) {
            DispatchQueue.main.async {
                self.currentSong = songString.trimmingCharacters(in: .whitespacesAndNewlines)
                self.updateAlbumArtUrl(song: self.currentSong)
            }
        }
    }.resume()
}

private func updateAlbumArtUrl(song: String) {
    let parts = song.split(separator: "-", maxSplits: 1, omittingEmptySubsequences: true).map { $0.trimmingCharacters(in: .whitespaces) }
    let artist = parts.first ?? ""
    let titleWithExtra = parts.count > 1 ? parts[1] : ""
    let title = titleWithExtra.components(separatedBy: " (").first ?? titleWithExtra
    let artistAndTitle = artist.isEmpty || title.isEmpty ? song : "\(artist) - \(title)"
    let encodedArtistAndTitle = artistAndTitle.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) ?? artistAndTitle
    albumArtUrl = URL(string: "https://www.heartfm.com.tr/ArtCover/\(encodedArtistAndTitle).jpg")
}
Dear Sirs,
I've written an audio driver based on AudioDriverKit.
In my audio callback function I receive calls with the IO operations IOUserAudioIOOperationWriteEnd and IOUserAudioIOOperationBeginRead as expected, which means I see IOUserAudioIOOperationWriteEnd operations during playback in an application like VLC or the browser, and IOUserAudioIOOperationBeginRead when recording in Audacity, etc.
But when I open System Settings, go to Sound, and select my driver as input, I also see calls with IOUserAudioIOOperationWriteEnd, which seem to be the just-read input data. I can also observe this when starting up Teams. I think the purpose is to add the (mic) input to the output so you have the chance to listen to yourself.
Nevertheless, I'd like to avoid this entirely, but I don't see a way to distinguish between the playback audio data and the input audio data inside this callback. How could I do this?
Or, even better, is there a switch that would completely turn off these callbacks that forward the input to the output?
Thanks and best regards,
Johannes
Does the new MV-HEVC Vision Pro spatial video format support an alpha channel? I've tried converting a side-by-side video with an alpha channel using this Apple example project, but the alpha channel is removed:
https://developer.apple.com/documentation/avfoundation/media_reading_and_writing/converting_side-by-side_3d_video_to_multiview_hevc
Hi!
For a couple of days, and only for some users, we have been getting this error message from this endpoint: https://api.music.apple.com/v1/me/library/playlists?limit=100
{"id":"6NT5LBXIZW65K2G3L6QY3WWYAA","title":"Upstream Service Error","detail":"Error fetching library content","status":"500","code":"50001"}
Any idea?
How can I extract an object from a picture, or remove the background behind an object, just like you can when creating stickers in the Photos app? Is there any official model or library for this, other than some website's API? (DeepLabV3.mlmodel cannot infer what I need.)
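One possible approach, if iOS 17 is an option (an untested sketch): Vision's VNGenerateForegroundInstanceMaskRequest appears to provide the same subject-lifting segmentation that the Photos sticker feature uses.

import Vision

func extractSubject(from cgImage: CGImage) throws -> CVPixelBuffer? {
    let request = VNGenerateForegroundInstanceMaskRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])

    guard let observation = request.results?.first else { return nil }
    // Mask out everything except the detected foreground instances.
    return try observation.generateMaskedImage(ofInstances: observation.allInstances,
                                               from: handler,
                                               croppedToInstancesExtent: true)
}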
Dear Sirs,
When writing an AudioServerPlugin, I can use the host's WriteToStorage/CopyFromStorage functions to save and restore custom properties across a restart of the machine. Are there corresponding functions for an audio driver based on AudioDriverKit? What would be the recommended way to save and restore properties so that they are available again after a reboot in an AudioDriverKit-based driver?
Thanks and best regards,
Johannes