Sequoia 15.4.1 (24E263)
XCode: 16.3 (16E140)
Logic Pro: 11.2.1
I’ve been developing a complex audio unit for Mac OS that works perfectly well in its own bespoke host app and is now well into its beta testing stage.
It did take some effort to get it to work well in Logic Pro however and all was fine and working well until:
The AU part is an empty app extension with a framework containing its code.
The framework contains Swift code for the UI and C code for the DSP parts.
When the framework is compiled using the Swift 5 compiler the AU will run in Logic with no problems.
(I should also mention that AU passes the most strict auval tests).
But… when the framework is compiled with Swift 6 Logic Pro cannot load it.
Logic displays a message saying the audio unit could not be loaded and to contact the developer.
My own host app loads the AU perfectly well with the Swift 6 version, so I know there’s nothing wrong with the audio unit.
I cannot find any differences in any of the built output files except, of course, the actual binary code in the framework.
I’ve worked for hours on this and cannot find a solution other than to build the framework in Swift 5.
(I worked hard to get all the async code updated and working with Swift 6! so I feel a little cheated!)
What is happening?
Is this a bug in Logic?
Is this a bug in Swift 6 compiler/linker?
I’m at the Duh! hands in the air, tearing out hair stage! ( once again!)
Audio
RSS for tagDive into the technical aspects of audio on your device, including codecs, format support, and customization options.
Selecting any option will automatically load the page
Post
Replies
Boosts
Views
Activity
It sounds simple but searching for the name "Favorite Songs" is a non-starter because it's called different names in different countries, even if I specify "&l=en_us" on the query.
So is there another property, relationship or combination thereof which I can use to tell me when I've found the right playlist?
Properties I've looked at so far:
canEdit: will always be false so narrows things down a little
inFavorites: not helpful as it depends on whether the user has favourite the favourites playlist, so not relevant
hasCatalog: seems always true so again may narrow things down a bit
isPublic: doesn't help
Adding the catalog relationship doesn't seem to show anything immediately useful either.
Can anyone help?
Ideally I'd like to see this as a "kind" or "type" as it has different properties to other playlists, but frankly I'll take anything at this point.
When multiple identical songs are added to a playlist, Playlist.Entry.id uses a suffix-based identifier (e.g. songID_0, songID_1, etc.). Removing one entry causes others to shift, changing their .id values. This leads to diffing errors and collection view crashes in SwiftUI or UIKit when entries are updated.
Steps to Reproduce:
Add the same song to a playlist multiple times.
Observe .id.rawValue of entries (e.g. i.SONGID_0, i.SONGID_1).
Remove one entry.
Fetch playlist again — note the other IDs have shifted.
FB18879062
Hello,
I have an existing AUv3 instrument plugin. In the plug in, users can access files (audio files, song projects) via a UIDocumentPickerViewController
In Logic Pro, (and some other hosts, but not all), the document picker is unable to receive touches, while a keyboard case is attached to the iPad.
Removing the case (this is an Apple brand iPad case) allows the interactions to resume and allows me to pick files in the usual way.
One of my users reports this non-responsive behavior occurs even after disconnecting their keyboard.
I have fiddled with entitlements all day, and have determined that is not the issue, since the keyboard disconnection appears to fix it every time for me.
Here is my, very boilerplate, presentation code :
guard let type = UTType("com.my.type") else {
return
}
let fileBrowser = UIDocumentPickerViewController(forOpeningContentTypes: [type])
fileBrowser.overrideUserInterfaceStyle = .dark
fileBrowser.delegate = self
fileBrowser.directoryURL = myFileFolderURL()
self.present(fileBrowser, animated: true) {
I'm streaming mp3 audio data using URLSession/AudioFileStream/AVAudioConverter and getting occasional silent buffers and glitches (little bleeps and whoops as opposed to clicks). The issues are present in an offline test, so this isn't an issue of underruns.
Doing some buffering on the input coming from the URLSession (URLSessionDataTask) reduces the glitches/silent buffers to rather infrequent, but they do still happen occasionally.
var bufferedData = Data()
func parseBytes(data: Data) {
bufferedData.append(data)
// XXX: this buffering reduces glitching
// to rather infrequent. But why?
if bufferedData.count > 32768 {
bufferedData.withUnsafeBytes { (bytes: UnsafeRawBufferPointer) in
guard let baseAddress = bytes.baseAddress else { return }
let result = AudioFileStreamParseBytes(audioStream!,
UInt32(bufferedData.count),
baseAddress,
[])
if result != noErr {
print("❌ error parsing stream: \(result)")
}
}
bufferedData = Data()
}
}
No errors are returned by AudioFileStream or AVAudioConverter.
func handlePackets(data: Data,
packetDescriptions: [AudioStreamPacketDescription]) {
guard let audioConverter else {
return
}
var maxPacketSize: UInt32 = 0
for packetDescription in packetDescriptions {
maxPacketSize = max(maxPacketSize, packetDescription.mDataByteSize)
if packetDescription.mDataByteSize == 0 {
print("EMPTY PACKET")
}
if Int(packetDescription.mStartOffset) + Int(packetDescription.mDataByteSize) > data.count {
print("❌ Invalid packet: offset \(packetDescription.mStartOffset) + size \(packetDescription.mDataByteSize) > data.count \(data.count)")
}
}
let bufferIn = AVAudioCompressedBuffer(format: inFormat!, packetCapacity: AVAudioPacketCount(packetDescriptions.count), maximumPacketSize: Int(maxPacketSize))
bufferIn.byteLength = UInt32(data.count)
for i in 0 ..< Int(packetDescriptions.count) {
bufferIn.packetDescriptions![i] = packetDescriptions[i]
}
bufferIn.packetCount = AVAudioPacketCount(packetDescriptions.count)
_ = data.withUnsafeBytes { ptr in
memcpy(bufferIn.data, ptr.baseAddress, data.count)
}
if verbose {
print("handlePackets: \(data.count) bytes")
}
// Setup input provider closure
var inputProvided = false
let inputBlock: AVAudioConverterInputBlock = { packetCount, statusPtr in
if !inputProvided {
inputProvided = true
statusPtr.pointee = .haveData
return bufferIn
} else {
statusPtr.pointee = .noDataNow
return nil
}
}
// Loop until converter runs dry or is done
while true {
let bufferOut = AVAudioPCMBuffer(pcmFormat: outFormat, frameCapacity: 4096)!
bufferOut.frameLength = 0
var error: NSError?
let status = audioConverter.convert(to: bufferOut, error: &error, withInputFrom: inputBlock)
switch status {
case .haveData:
if verbose {
print("✅ convert returned haveData: \(bufferOut.frameLength) frames")
}
if bufferOut.frameLength > 0 {
if bufferOut.isSilent {
print("(haveData) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)")
}
outBuffers.append(bufferOut)
totalFrames += Int(bufferOut.frameLength)
}
case .inputRanDry:
if verbose {
print("🔁 convert returned inputRanDry: \(bufferOut.frameLength) frames")
}
if bufferOut.frameLength > 0 {
if bufferOut.isSilent {
print("(inputRanDry) SILENT BUFFER at frame \(totalFrames), pending: \(pendingFrames), inputPackets=\(bufferIn.packetCount), outputFrames=\(bufferOut.frameLength)")
}
outBuffers.append(bufferOut)
totalFrames += Int(bufferOut.frameLength)
}
return // wait for next handlePackets
case .endOfStream:
if verbose {
print("✅ convert returned endOfStream")
}
return
case .error:
if verbose {
print("❌ convert returned error")
}
if let error = error {
print("error converting: \(error.localizedDescription)")
}
return
@unknown default:
fatalError()
}
}
}
I bought two "Apple USB-C to Headphone Jack Adapters". Upon closer inspection, they seems to be of different generations:
The one with product ID 0x110a on top is working fine. The one with product ID 0x110b has two issues:
There is a short but loud click noise on the headphone when I connect it to the iPad.
When I play audio using AVAudioPlayer the first half of a second or so is cut off.
Here's how I'm playing the audio:
audioPlayer = try AVAudioPlayer(contentsOf: url)
audioPlayer?.delegate = self
audioPlayer?.prepareToPlay()
audioPlayer?.play()
Is this a known issue? Am I doing something wrong?
I've got a setup using AVAudioEngine with several tone generator nodes, each with a chain of processing nodes, the chains then mixed into the main output.
Generator ➡️ Effect ➡️... ➡️ .mainMixerNode ➡️ .outputNode).
Generator ➡️ Effect ➡️... ⤴️
...
Generator ➡️ Effect ➡️... ⤴️
The user should be able to mute any chain individually. I've found several potential approaches to muting, but not terribly happy with any of them.
Adjust the amplitudes directly in my tone generators. Issue: Consumes CPU even when completely muted. 4 generators adds ~15% cpu, even when all chains are muted.
Detach/attach chains that are muted/unmuted. Issue: Causes loud clicking/popping sounds whenever muted/unmuted.
Fade mixer output volume while detaching/attaching a chain (just cutting the volume immediately to 0 doesn't get rid of the clicking/popping). Issue: Causes all channels to fade during the transition, so not ideal.
The rest of these ideas are variations on making volume control+detatch/attach work for individual chains, since approach #3 worked well.
Add an AVAudioMixer to the end of each chain (just for volume control). Issue: Only the mixer on the final chain functions -- the others block all output. Not sure what's going on there.
Use matrix mixer (for multi-input volume control). Plus detach/attach to reduce CPU if necessary. Not yet attempted, due to perceived complexity and reports of fragility in order of wiring in. A bunch of effort before I even know if it's going to work.
Develop my own fader node to put on the end of each channel. Unlike the tone generator (simple AVSourceNode), developing an effect node seems complex and time consuming. Might not even fix CPU use.
I'm not completely averse to the learning curve of either 5 or 6, but would rather get some guidance on best approach before diving in. They both seem likely to take more effort than I'd like for the simple behavior I'm trying to achieve.
I have sent in a feedback report (FB18222398) but I have no idea if anyone has looked at it. I know from past experiences that Apple devs do look at these forums.
This applies to each of the betas, 1, 2 and 3. I have created a new Personal Voice with each beta. I create a personal voice in English. When it's done processing, I tap Preview and it says in English what is expected. But after some time, an hour or a day, the language of the voice file changes languages and no longer works properly. If I press Preview it is no longer intelligible. I have a text to speech app and initially the created voice works but then when the language of the file changes, it no longer works. I have run an app on my iphone through Xcode that prints to the console the voices installed on the device with the language. Currently this is the voice file:
Voice Identifier: com.apple.speech.personalvoice.AAA9C6F2-9125-475F-BA2F-22C63274991D
Language: es-MX
and on a second device the same personal voice is in a different language:
Voice Identifier: com.apple.speech.personalvoice.AAA9C6F2-9125-475F-BA2F-22C63274991D
Language: zh-CN
Although, a previous personal voice file that listed as Spanish-Mexican played in English with a Spanish accent or when playing Spanish text, it sounded almost perfect. This current personal voice doesn't do that, and is unintelligible. Previous attempts have converted to Chinese.
I hope someone can look into this.
Hi all,
I'm working on an audio visualizer app that plays files from the user's music library utilizing MediaPlayer and AVAudioEngine. I'm working on getting the music library functionality working before the visualizer aspect.
After setting up the engine for file playback, my app inexplicably crashes with an EXC_BREAKPOINT with code = 1. Usually this means I'm unwrapping a nil value, but I think I'm handling the optionals correctly with guard statements. I'm not able to pinpoint where it's crashing. I think it's either in the play function or the setupAudioEngine function. I removed the processAudioBuffer function and my code still crashes the same way, so it's not that. The device that I'm testing this on is running iOS 26 beta 3, although my app is designed for iOS 18 and above.
After commenting out code, it seems that the app crashes at the scheduleFile call in the play function, but I'm not fully sure.
Here is the setupAudioEngine function:
private func setupAudioEngine() {
do {
try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default)
try AVAudioSession.sharedInstance().setActive(true)
} catch {
print("Audio session error: \(error)")
}
engine.attach(playerNode)
engine.attach(analyzer)
engine.connect(playerNode, to: analyzer, format: nil)
engine.connect(analyzer, to: engine.mainMixerNode, format: nil)
analyzer.installTap(onBus: 0, bufferSize: 1024, format: nil) { [weak self] buffer, _ in
self?.processAudioBuffer(buffer)
}
}
Here is the play function:
func play(_ mediaItem: MPMediaItem) {
guard let assetURL = mediaItem.assetURL else {
print("No asset URL for media item")
return
}
stop()
do {
audioFile = try AVAudioFile(forReading: assetURL)
guard let audioFile else {
print("Failed to create audio file")
return
}
duration = Double(audioFile.length) / audioFile.fileFormat.sampleRate
if !engine.isRunning {
try engine.start()
}
playerNode.scheduleFile(audioFile, at: nil)
playerNode.play()
DispatchQueue.main.async { [weak self] in
self?.isPlaying = true
self?.startDisplayLink()
}
} catch {
print("Error playing audio: \(error)")
DispatchQueue.main.async { [weak self] in
self?.isPlaying = false
self?.stopDisplayLink()
}
}
}
Here is a link to my test project if you want to try it out for yourself:
https://github.com/aabagdi/VisualMan-example
Thanks!
I developed an educational app that implements audio-video communication through RTC, while using WebView to display course materials during classes. However, some users are experiencing an issue where the audio playback from WebView is very quiet. I've checked that the AVAudioSessionCategory is set by RTC to AVAudioSessionCategoryPlayAndRecord, and the AVAudioSessionCategoryOption also includes AVAudioSessionCategoryOptionMixWithOthers. What could be causing the WebView audio to be suppressed, and how can this be resolved?
I’m developing a macOS audio monitoring app using AVAudioEngine, and I’ve run into a critical issue on macOS 26 beta where AVFoundation fails to detect any input devices, and AVAudioEngine.start() throws the familiar error 10877.
FB#: FB19024508
Strange Behavior:
AVAudioEngine.inputNode shows no channels or input format on bus 0.
AVAudioEngine.start() fails with -10877 (AudioUnit connection error).
AVCaptureDevice.DiscoverySession returns zero audio devices.
Microphone permission is granted (authorized), and the app is properly signed and sandboxed with com.apple.security.device.audio-input.
However, CoreAudio HAL does detect all input/output devices:
Using AudioObjectGetPropertyDataSize and AudioObjectGetPropertyData with kAudioHardwarePropertyDevices, I can enumerate 14+ devices, including AirPods, USB DACs, and BlackHole.
This suggests the lower-level audio stack is functional.
I have tried:
Resetting CoreAudio with sudo killall coreaudiod
Rebuilding and re-signing the app
Clearing TCC with tccutil reset Microphone
Running on Apple Silicon and testing Rosetta/native detection via sysctl.proc_translated
Using a fallback mechanism that logs device info from HAL and rotates logs for submission via Feedback Assistant
I have submitted logs and a reproducible test case via Feedback Assitant : FB#: FB19024508]
Dear Sirs,
I'd like to add an icon to my audio driver based on AudioDriverKit. This icon should show up left of my audio device in the audio devices dialog. For an Audio Server Plugin I managed to do this using the property kAudioDevicePropertyIcon and CFBundleCopyResourceURL(...) but how would you do this with AudioDriverKit? Should I use IOUserAudioCustomProperty or IOUserAudioControl and how would I refer to the Bundle? Is there an example available somewhere?
Thanks and best regards,
Johannes
i tried combine speech detector and speech transciber to anlayzer.
but speech detector is not speech module. please help me
I'm working in Swift/SwiftUI, running XCode 16.3 on macOS 15.4 and I've seen this when running in the iOS simulator and in a macOS app run from XCode. I've also seen this behaviour with 3 different audio files.
Nothing in the documentation says that the speechRecognitionMetadata property on an SFSpeechRecognitionResult will be nil until isFinal, but that's the behaviour I'm seeing.
I've stripped my class down to the following:
private var isAuthed = false
// I call this in a .task {} in my SwiftUI View
public func requestSpeechRecognizerPermission() {
SFSpeechRecognizer.requestAuthorization { authStatus in
Task {
self.isAuthed = authStatus == .authorized
}
}
}
public func transcribe(from url: URL) {
guard isAuthed else { return }
let locale = Locale(identifier: "en-US")
let recognizer = SFSpeechRecognizer(locale: locale)
let recognitionRequest = SFSpeechURLRecognitionRequest(url: url)
// the behaviour occurs whether I set this to true or not, I recently set
// it to true to see if it made a difference
recognizer?.supportsOnDeviceRecognition = true
recognitionRequest.shouldReportPartialResults = true
recognitionRequest.addsPunctuation = true
recognizer?.recognitionTask(with: recognitionRequest) { (result, error) in
guard result != nil else { return }
if result!.isFinal {
//speechRecognitionMetadata is not nil
} else {
//speechRecognitionMetadata is nil
}
}
}
}
Further, and this isn't documented either, the SFTranscriptionSegment values don't have correct timestamp and duration values until isFinal. The values aren't all zero, but they don't align with the timing in the audio and they change to accurate values when isFinal is true.
The transcription otherwise "works", in that I get transcription text before isFinal and if I wait for isFinal the segments are correct and speechRecognitionMetadata is filled with values.
The context here is I'm trying to generate a transcription that I can then highlight the spoken sections of as audio plays and I'm thinking I must be just trying to use the Speech framework in a way it does not work. I got my concept working if I pre-process the audio (i.e. run it through until isFinal and save the results I need to json), but being able to do even a rougher version of it 'on the fly' - which requires segments to have the right timestamp/duration before isFinal - is perhaps impossible?
Hello Apple Developer Community,
I am seeking clarification on the intended display behavior of HLS audio tracks within the iOS 26 (or current beta) native player, specifically concerning the NAME and LANGUAGE attributes of the EXT-X-MEDIA tag.
In our HLS manifests, we define alternative audio tracks using EXT-X-MEDIA tags, like so:
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-1",DEFAULT=YES,AUTOSELECT=YES,URI="audio_ja.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",LANGUAGE="ja",NAME="AUDIO-2",URI="audio_en.m3u8"
Our observation is that when an audio track is selected and its name is displayed in the native iOS media controls (e.g., Control Center or within a full-screen video player's UI), the value specified in the NAME attribute ("AUDIO-1", "AUDIO-2") does not seem to be used. Instead, the display appears to derive from the LANGUAGE attribute ("ja", "en"), often showing the system's localized string for that language (e.g., "Japanese", "English").
We would like to understand the official or intended behavior regarding this.
Is it the expected behavior for the iOS native player to prioritize the LANGUAGE attribute (or its localized equivalent) over the NAME attribute for displaying the selected audio track's label?
If this is the intended design, what is the recommended best practice for developers who wish to present a custom, human-readable name for audio tracks (beyond the standard language name) in the native iOS UI?
Are there any specific AVPlayer properties or AVMediaSelectionOption considerations that would allow more granular control over this display, or is this entirely managed by the system based on the LANGUAGE attribute?
Any insights or official guidance on this behavior in iOS 26 (and potentially previous versions) would be greatly appreciated.
Thank you for your time and assistance.
AVAudioFormat has no Swift concurrency annotations but the documentation states "Instances of this class are immutable."
This made me always assume it was safe to pass AVAudioFormat instances around. Is this the case? If so can it be marked as Sendable? Am I missing something?
I ran 5.1 audio tests in both YouTube and Apple Music, and I noticed that when sound is supposed to play from the rear or front surround speakers, it’s also duplicated in the front left and right channels. I’m absolutely sure the issue is with the Apple TV, because I played the same video directly through my TV’s native system, and the channel separation was correct.
Everything used to work perfectly before, so this must be a software issue. I’m currently on tvOS 26 Developer Beta 5, but I’m certain the problem also existed on the stable tvOS 18.5.
I’ve already reset and updated my Apple TV, and I also tried switching the audio format to forced Dolby Atmos 5.1. On the forums, I mostly see complaints about Dolby Atmos not working at all — in my case, everything technically works, but not the way it’s supposed to.
Topic:
Media Technologies
SubTopic:
Audio
Since iOS 18, the system setting “Allow Audio Playback” (enabled by default) allows third-party app audio to continue playing while the user is recording video with the Camera app. This has created a problem for the app I’m developing.
➡️ The problem:
My app plays continuous audio in both foreground and background states. If the user starts recording video using the iOS Camera app, the app’s audio — still playing in the background — gets captured in the video — obviously an unintended behavior.
Yes, the user could stop the app manually before starting the video recording, but that can’t be guaranteed. As a developer, I need a way to stop the app’s audio before the video recording begins.
So far, I haven’t found a reliable way to detect when video recording starts if ‘Allow Audio Playback’ is ON.
➡️ What I’ve tried:
— AVAudioSession.interruptionNotification → doesn’t fire
— devicesChangedEventStream → not triggered
I don’t want to request mic permission (app doesn’t use mic). also, disabling the app from playing audio in the background isn’t an option as it is a crucial part of the user experience
➡️ What I need:
A reliable, supported way to detect when the Camera app begins video recording, without requiring mic access — so I can stop audio and avoid unintentional overlap with the user’s recordings.
Any official guidance, workarounds, or AVFoundation techniques would be greatly appreciated.
Thanks.
Hi team,
With regards to Call (Live) Translations on VOIP:
Is it possible to invoke live translations within the app? (without going into the Call System UI)
Is it possible to navigate users from app to Call System UI via an API? (and also invoking the new live translations directly)
Will Apple support more languages apart from the current ones? (Currently I see 4 supported languages)
I started playing which transcription of audio files on macOS today, latest beta of Xcode and latest beta of Tahoe. Transcription itself works really well, but for some reason the majority of the results contain no audioTimeRange. I got 22 single-word results with time ranges, spread out all over total file of 53 minutes.
Is there something I can do to improve this? To my understanding, I have followed sample code and instructions very closely, but the SwiftTranscriptionSampleApp and other examples I've seen lead me to believe I should be getting a lot more time ranges than I actually do.