Hi, everyone. I downloaded the sample code EditingSpatialAudioWithAnAudioMix.zip from https://developer.apple.com/documentation/Cinematic/editing-spatial-audio-with-an-audio-mix. When I ran the action named "process" from the command line, the program crashed.
From the source code, I found that componentType is set to kAudioUnitType_FormatConverter:
// The actual `AudioUnit`.
public var auAudioMix = AVAudioUnitEffect()
init() {
    // Generate a component description for the audio unit.
    let componentDescription = AudioComponentDescription(
        componentType: kAudioUnitType_FormatConverter,
        componentSubType: kAudioUnitSubType_AUAudioMix,
        componentManufacturer: kAudioUnitManufacturer_Apple,
        componentFlags: 0,
        componentFlagsMask: 0)
    auAudioMix = AVAudioUnitEffect(audioComponentDescription: componentDescription)
}
But according to the documentation at https://developer.apple.com/documentation/avfaudio/avaudiouniteffect/init(audiocomponentdescription:), it seems that componentType cannot be set to kAudioUnitType_FormatConverter.
Has anyone else encountered this problem?
Audio
Integrate music and other audio content into your apps.
Posts under Audio tag
77 Posts
I’m facing a problem while trying to achieve spatial audio effects in my iOS 18 app. I have tried several approaches to get good 3D audio, but the effect either never felt good enough or didn’t work at all.
What troubles me most is that my AirPods don’t recognize my app as one that plays spatial audio (the audio settings show "Spatial Audio Not Playing"), so I guess my app isn’t using spatial audio at all.
The first approach uses AVAudioEnvironmentNode with AVAudioEngine. Changing the position of the player, or of the listener, doesn’t seem to change anything in how the audio plays.
Here is a simplified version of how I initialize AVAudioEngine:
import Foundation
import AVFoundation
class AudioManager: ObservableObject {
    // important class variables
    var audioEngine: AVAudioEngine!
    var environmentNode: AVAudioEnvironmentNode!
    var playerNode: AVAudioPlayerNode!
    var audioFile: AVAudioFile?
    ...
    //Sound set up
    func setupAudio() {
        do {
            let session = AVAudioSession.sharedInstance()
            try session.setCategory(.playback, mode: .default, options: [])
            try session.setActive(true)
        } catch {
            print("Failed to configure AVAudioSession: \(error.localizedDescription)")
        }
        audioEngine = AVAudioEngine()
        environmentNode = AVAudioEnvironmentNode()
        playerNode = AVAudioPlayerNode()
        audioEngine.attach(environmentNode)
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: environmentNode, format: nil)
        audioEngine.connect(environmentNode, to: audioEngine.mainMixerNode, format: nil)
        environmentNode.listenerPosition = AVAudio3DPoint(x: 0, y: 0, z: 0)
        environmentNode.listenerAngularOrientation = AVAudio3DAngularOrientation(yaw: 0, pitch: 0, roll: 0)
        environmentNode.distanceAttenuationParameters.referenceDistance = 1.0
        environmentNode.distanceAttenuationParameters.maximumDistance = 100.0
        environmentNode.distanceAttenuationParameters.rolloffFactor = 2.0
        // example.mp3 is mono sound
        guard let audioURL = Bundle.main.url(forResource: "example", withExtension: "mp3") else {
            print("Audio file not found")
            return
        }
        do {
            audioFile = try AVAudioFile(forReading: audioURL)
        } catch {
            print("Failed to load audio file: \(error)")
        }
    }
    ...
    //Playing sound
    func playSpatialAudio(pan: Float) {
        guard let audioFile = audioFile else { return }
        // left side
        playerNode.position = AVAudio3DPoint(x: pan, y: 0, z: 0)
        playerNode.scheduleFile(audioFile, at: nil, completionHandler: nil)
        do {
            try audioEngine.start()
            playerNode.play()
        } catch {
            print("Failed to start audio engine: \(error)")
        }
        ...
    }
The second, more complex approach using PHASE did better. I’ve made a sample app that lets the user move the audio player in 3D space. I have added reverb and sliders that change the audio position up to 10 meters in each direction from the listener, but the audio only seems to really change from left to right (x axis). Again, I think the problem might be that the app is not being recognized as spatial.
//Crucial class Variables:
class PHASEAudioController: ObservableObject {
    private var soundSourcePosition: simd_float4x4 = matrix_identity_float4x4
    private var audioAsset: PHASESoundAsset!
    private let phaseEngine: PHASEEngine
    private let params = PHASEMixerParameters()
    private var soundSource: PHASESource
    private var phaseListener: PHASEListener!
    private var soundEventAsset: PHASESoundEventNodeAsset?
    // Initialization of PHASE
    init() {
        do {
            let session = AVAudioSession.sharedInstance()
            try session.setCategory(.playback, mode: .default, options: [])
            try session.setActive(true)
        } catch {
            print("Failed to configure AVAudioSession: \(error.localizedDescription)")
        }
        // Init PHASE Engine
        phaseEngine = PHASEEngine(updateMode: .automatic)
        phaseEngine.defaultReverbPreset = .mediumHall
        phaseEngine.outputSpatializationMode = .automatic //nothing helps
        // Set listener position to (0,0,0) in World space
        let origin: simd_float4x4 = matrix_identity_float4x4
        phaseListener = PHASEListener(engine: phaseEngine)
        phaseListener.transform = origin
        phaseListener.automaticHeadTrackingFlags = .orientation
        try! self.phaseEngine.rootObject.addChild(self.phaseListener)
        do {
            try self.phaseEngine.start()
        } catch {
            print("Could not start PHASE engine")
        }
        audioAsset = loadAudioAsset()
        // Create sound Source
        // Sphere
        soundSourcePosition.translate(z: 3.0)
        let sphere = MDLMesh.newEllipsoid(withRadii: vector_float3(0.1, 0.1, 0.1), radialSegments: 14, verticalSegments: 14, geometryType: MDLGeometryType.triangles, inwardNormals: false, hemisphere: false, allocator: nil)
        let shape = PHASEShape(engine: phaseEngine, mesh: sphere)
        soundSource = PHASESource(engine: phaseEngine, shapes: [shape])
        soundSource.transform = soundSourcePosition
        print(soundSourcePosition)
        do {
            try phaseEngine.rootObject.addChild(soundSource)
        } catch {
            print("Failed to add a child object to the scene.")
        }
        let simpleModel = PHASEGeometricSpreadingDistanceModelParameters()
        simpleModel.rolloffFactor = rolloffFactor
        soundPipeline.distanceModelParameters = simpleModel
        let samplerNode = PHASESamplerNodeDefinition(
            soundAssetIdentifier: audioAsset.identifier,
            mixerDefinition: soundPipeline,
            identifier: audioAsset.identifier + "_SamplerNode")
        samplerNode.playbackMode = .looping
        do {
            soundEventAsset = try phaseEngine.assetRegistry.registerSoundEventAsset(
                rootNode: samplerNode,
                identifier: audioAsset.identifier + "_SoundEventAsset")
        } catch {
            print("Failed to register a sound event asset.")
            soundEventAsset = nil
        }
    }
    //Playing sound
    func playSound() {
        // Fire new sound event with currently set properties
        guard let soundEventAsset else { return }
        params.addSpatialMixerParameters(
            identifier: soundPipeline.identifier,
            source: soundSource,
            listener: phaseListener)
        let soundEvent = try! PHASESoundEvent(engine: phaseEngine,
            assetIdentifier: soundEventAsset.identifier,
            mixerParameters: params)
        soundEvent.start(completion: nil)
    }
    ...
}
It may also be worth mentioning that I only have a personal team account.
Hi there,
I recently launched a DJ app on the Mac App Store, and I was wondering how I could access songs from Apple Music for mixing purposes, the way Serato, rekordbox, djay, and other DJ apps do.
Thanks,
Gunek
FaceTime’s screen-share audio balance is insanely absurd right now. Whenever I share media, the system audio that gets sent through FaceTime is a tiny whisper even at full volume (or even when connected to my speaker or headphones). The moment anyone on the call makes any noise at all, the shared audio ducks so hard it disappears, while the voice (or rustling or air conditioning noise) spikes to painful levels. It’s impossible to watch or listen to anything together. Also, the feature where FaceTime would shrink to a square during screen-sharing has been completely removed. That was a good feature and I'm really confused why it's gone. Now, the FaceTime window stays as a long rectangle that covers part of the content I'm trying to share (unless I do full screen tile, but then I can't pull up any other windows during the call) and can't be made smaller than about a third of the screen. You can't resize the window or adjust its dimensions, so it ends up blocking the actual media you're trying to watch.
Here are some feature requests/fixes that would greatly improve the FaceTime screen-share experience:
Option to adjust the shared media volume independently of call audio.
Disable/toggle the extreme automatic audio ducking while screen-sharing.
Reintroduce the minimized “floating square” mode or allow full manual resizing and repositioning of the FaceTime window during screen-share sessions.
Overall, this setup makes FaceTime screen-sharing basically unusable. The audio balance is so inconsistent that it’s easier to switch to Zoom or Google Meet, which both handle shared sound correctly and let you move the call window out of the way. Until these issues are fixed, there’s no practical reason to use FaceTime for shared viewing at all.
We require assistance in resolving a critical audio design conflict within our Push-to-Talk (PTT) application. Our current volume amplification strategy—which relies on applying a GAIN factor to PCM samples in conjunction with setting the AVAudioSession category to Playback—is working successfully when PTT is used independently. However, upon integrating and reporting the same PTT call through the CallKit framework, this amplification effect is lost. The CallKit integration appears to be forcing a different, non-amplifying audio session category or configuration, negatively impacting the user's perceived call volume. We need guidance on how to maintain the AVAudioSessionCategoryPlayback setting, or an equivalent high-volume configuration, while operating under the control of CallKit.
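The post above does not include its amplification code. As a rough illustration of what applying a gain factor to PCM samples can look like, here is a minimal sketch that assumes 16-bit integer samples. The `applyGain` function and its clamping behaviour are hypothetical and are not taken from the poster's app; the sketch also says nothing about how CallKit configures the audio session.

```swift
/// Hypothetical helper: scales 16-bit PCM samples by `gain`
/// and clamps the result to the valid Int16 range.
func applyGain(_ samples: [Int16], gain: Float) -> [Int16] {
    return samples.map { sample in
        // Scale in floating point, then round to the nearest integer value.
        let scaled = (Float(sample) * gain).rounded()
        // Clamp so that the conversion back to Int16 cannot overflow.
        let clamped = min(max(scaled, Float(Int16.min)), Float(Int16.max))
        return Int16(clamped)
    }
}

// Example: a gain of 2.0 doubles quiet samples and clips loud ones.
let boosted = applyGain([1000, -1000, 30000, -30000], gain: 2.0)
print(boosted)
```

Clamping matters here because converting an out-of-range floating-point value to `Int16` traps at run time in Swift; the trade-off is hard clipping whenever the gain pushes a sample past full scale.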
Hi. I am mixing content destined for Vision Pro, locked to video. I have the AAX installer, but the ASAF video player demonstrated in the QuickTime videos is not included in the install package for Pro Tools. Would it be possible to post a link?
I'm implementing the PushToTalk framework and have encountered an issue where channelManager(_:didActivate:) is not called under specific circumstances.
What works:
App is in foreground, receives PTT push → didActivate is called ✅
App receives audio in foreground, then is backgrounded → subsequent pushes trigger didActivate ✅
What doesn't work:
App is launched, user joins channel, then immediately backgrounds
PTT push arrives while app is backgrounded
incomingPushResult is called, I return .activeRemoteParticipant(participant)
The system UI shows the speaker name correctly
However, didActivate is never called
Audio data arrives via WebSocket but cannot be played (no audio session)
Setup:
Channel joined successfully before backgrounding
UIBackgroundModes includes push-to-talk
No manual audio session activation (setActive) anywhere in my code
AVAudioEngine setup only happens inside didActivate delegate method
Issue persists even after channel restoration via channelDescriptor(restoredChannelUUID:)
Question:
Is this expected behavior or a bug? If expected, what's the correct approach to handle incoming PTT audio when the app is backgrounded and hasn't received audio while in the foreground yet?
I'm new to iOS development and I've been trying to make heads or tails of the documentation. I know there is a difference between the data fields returned for songs from the user library and from the catalog, but whenever I search the Apple site I can't find a list of each. For example, I'm trying to get the releaseDate of a song in my library, but it seems I'll have to cross-query the catalog entry using either song.catalogID or song.isrc, and when I try to use them I can't find a cross-reference between the two. I'm totally turned around.
I'm also trying to determine whether a song in my library has been favorited. isFavorited (or something similar) doesn't seem to be a thing. With the code below, I'm trying to find a way to display a solid star if the song has been favorited or an empty one if it hasn't. It seems like a basic request, but I can't find anything on how to do it. I've searched the docs, googled, and experimented.
Does Apple want us to query the user's Favorited Songs playlist or something? How do I know which playlist that is?
I know isFavorited isn't a thing; I'm just using it here so you can see what my intention is:
HStack(spacing: 10) {
    Image(systemName: song.isFavorited ? "star.fill" : "star")
        .foregroundColor(song.isFavorited ? .yellow : .gray)
    Image(systemName: "magnifyingglass")
}
I am experiencing an issue where my Mac's speakers will crackle and pop when running an app on the Simulator or even when previewing SwiftUI with Live Preview.
I am using a 16" MacBook Pro (i9) and I'm running Xcode 12.2 on Big Sur (11.0.1).
Killing coreaudiod temporarily fixes the problem however this is not much of a solution.
Is anyone else having this problem?
Hi everyone,
I’m working on an iOS MusicKit app that overlays a metronome on top of Apple Music playback. To line the clicks up perfectly I’d like access to low-level audio analysis data—ideally a waveform / spectrogram or beat grid—while the track is playing.
I’ve noticed that several approved DJ apps (e.g. djay, Serato, rekordbox) can already:
• Display detailed scrolling waveforms of Apple Music songs
• Scratch, loop or time-stretch those tracks in real time
That implies they receive decoded PCM frames or at least high-resolution analysis data from Apple Music under a special entitlement.
My questions:
1. Does MusicKit (or any public framework) expose real-time audio buffers, FFT bins, or beat markers for streaming Apple Music content?
2. If not, is there an Apple program or entitlement that developers can apply for—similar to the “DJ with Apple Music” initiative—to gain that deeper access?
3. Where can I find official documentation or a point of contact for this kind of request?
I’ve searched the docs and forums but only see standard MusicKit playback APIs, which don’t appear to expose raw audio for DRM-protected songs. Any guidance, links or insider tips on the proper application process would be hugely appreciated!
Thanks in advance.
Hi,
I'm trying to set up an AVAudioEngine for USB audio recording and monitoring playthrough.
As soon as I try to set up playthrough, I get an error in the console: AVAEInternal.h:83 required condition is false: [AVAudioEngineGraph.mm:1361:Initialize: (IsFormatSampleRateAndChannelCountValid(outputHWFormat))]
Any ideas how to fix it?
// Set the input device
try? setupInputDevice(deviceID: inputDevice)
let input = audioEngine.inputNode
// Force a stereo format
let inputHWFormat = input.inputFormat(forBus: 0)
let stereoFormat = AVAudioFormat(commonFormat: inputHWFormat.commonFormat, sampleRate: inputHWFormat.sampleRate, channels: 2, interleaved: inputHWFormat.isInterleaved)
guard let format = stereoFormat else {
    throw AudioError.deviceSetupFailed(-1)
}
print("Input format: \(inputHWFormat)")
print("Forced stereo format: \(format)")
audioEngine.attach(monitorMixer)
audioEngine.connect(input, to: monitorMixer, format: format)
// MonitorMixer -> MainMixer (Output)
// Problem here, format: format also breaks.
audioEngine.connect(monitorMixer, to: audioEngine.mainMixerNode, format: nil)
Hello,
I'm working on a Flutter app targeting both Android and iOS, where I implemented ShazamKit.
In order to achieve that, I first tried with the flutter_shazam_kit package, but since it's not maintained anymore, I forked it here, and tried to update it to meet the Google Play Store requirements, as you can see here:
https://github.com/mregnauld/flutter_shazam_kit/tree/fix-16k
Unfortunately, after trying everything, my app still doesn't meet the (not so) new 16 KB native library alignment requirement. I'm also 100% sure the package is the cause, because the error message disappears if I remove it from my app.
After investigating, it seems that the problem comes from ShazamKit for Android (which you can find here: https://developer.apple.com/download/all/?q=Android%20ShazamKit), and especially from the .so files in the .aar file.
Is there anything I can do to fix that, or should I wait for the ShazamKit team to fix it?
I'm totally stuck with that so any help is highly appreciated.
Thanks.
Hi all,
I have been quite stumped by this behavior for a little while now, so I thought it best to share it here and see if someone more experienced with AVAudioEngine / AVAudioSession can weigh in.
Right now I have an AVAudioEngine that I use for voice chat, giving it buffers to play. This works perfectly until route changes start to occur, which cause the AVAudioEngine to reset itself, which in turn stops all players attached to this engine.
Once an AVAudioPlayerNode gets stopped because of this (but also at any other time), all samples that were scheduled to be played get purged. Where this becomes confusing for me is that the completion handler gets called every time, regardless of whether the sound was actually played.
Is there a reliable way to know if a sample needs to be rescheduled after a player has been reset?
I am not quite sure what my observer of AVAudioEngineConfigurationChange needs to be doing in my case, as this engine only handles output. All input goes through a separate engine for simplicity.
Currently I store a queue of samples as they get sent to the AVAudioPlayerNode for playback, and in the completion handler I check whether the player isPlaying. If it is playing, I assume that the sound was actually played. If not, I leave the sample in the queue and assume that an observer on the route change or the configuration change will notice that there are samples in the queue and reschedule them.
Thanks for any feedback!
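The queue bookkeeping described in the post above can be sketched roughly as follows. This is only an illustration of that description, not the poster's code: the `PendingQueue` type and its method names are hypothetical, the `playerIsPlaying` flag stands in for reading the player node's `isPlaying` property inside the completion handler, and thread safety (completion handlers may run off the main thread) is deliberately left out.

```swift
/// Hypothetical helper: remembers scheduled items so that unplayed ones
/// can be rescheduled after the player has been reset.
struct PendingQueue<Item> {
    private var pending: [(id: Int, item: Item)] = []
    private var nextID = 0

    /// Records an item as scheduled and returns its tracking ID.
    mutating func enqueue(_ item: Item) -> Int {
        let id = nextID
        nextID += 1
        pending.append((id: id, item: item))
        return id
    }

    /// Call from the completion handler. The item is dropped only if the
    /// player was still playing; otherwise it stays queued for rescheduling.
    mutating func complete(id: Int, playerIsPlaying: Bool) {
        guard playerIsPlaying else { return }
        pending.removeAll { $0.id == id }
    }

    /// Items that are still pending, in their original order.
    var itemsToReschedule: [Item] {
        return pending.map { $0.item }
    }
}

// Example: "a" completes while playing, "b" completes after a reset.
var queue = PendingQueue<String>()
let first = queue.enqueue("a")
let second = queue.enqueue("b")
queue.complete(id: first, playerIsPlaying: true)
queue.complete(id: second, playerIsPlaying: false)
print(queue.itemsToReschedule)
```

Like the approach in the post, this treats "player was playing at completion time" as a stand-in for "the sample was heard", which is the assumption the question is asking about.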
It appears iOS only comes with low quality voices installed.
iOS requires the user to go into settings to download higher quality voices to be used with AVSpeechUtterance.
There doesn't seem to be any API that can make this process easier for the app user.
Is there an API that would allow an app to download and use a higher quality voice?
Will Apple ever install higher quality voices by default?
We really want to use the text-to-speech API in iOS, but the high level of user friction involved in getting high quality voices is stopping us. I would appreciate a response.
Thanks
I'm getting hundreds of the message below in Xcode. I've narrowed it down to when I instantiate the following
AVAudioUnitComponentManager.shared()
Message send exceeds rate-limit threshold and will be dropped. { reporterID=231700600717315, rateLimit=32hz }
I’m developing a VoIP app that uses Linphone and CallKit. Everything works as expected until the user enables the speaker on the native CallKit screen. After that, all subsequent calls start with the speaker already on. Even if I call AVAudioSession.sharedInstance().overrideOutputAudioPort(.none), it gets overridden when the call starts (when Linphone begins playing the ringtone). I tested this behavior in WhatsApp, and it seems to work correctly there.
I’m using the shared instance of AVAudioSession. After activating it with .setActive(true), I observe the outputVolume, and it correctly reports the device’s volume.
However, after deactivating the session using .setActive(false), changing the volume, and then reactivating it again, the outputVolume returns the previous volume (before deactivation), not the current device volume. The correct volume is only reported after the user manually changes it again using physical buttons or Control Center, which triggers the observer.
What I need is a way to retrieve the actual current device volume immediately after reactivating the audio session, even on the second and subsequent activations.
Disabling and re-enabling the audio session is essential to how my application functions.
I’ve tested this behavior with my colleagues, and the issue is consistently reproducible on iOS 18.0.1, iOS 18.1, iOS 18.3, iOS 18.5 and iOS 18.6.2. On devices running iOS 17.6.1 and iOS 16.0.3, outputVolume correctly reflects the current volume immediately after calling .setActive(true) multiple times.
Hello, I want to know if there are any restrictions with MusicKit to be used in a mobile app to be able to manipulate audio with an EQ on tracks coming from Apple Music, without modifying the actual track structure/data of course, just the audio output.
I am trying to debug the AAX version of my plugin (MIDI effect) on Pro Tools, but I am getting the following error (Mac console) when attempting to load it:
dlsym cannot find symbol g_dwILResult in CFBundle etc..
I used Xcode 16.4 to build the plugin.
Has anybody come across the same or a similar message?
Best,
Achillefs
Axart Labs