I receive a buffer from -[AVSpeechSynthesizer writeUtterance:toBufferCallback:] and want to schedule it on an AVAudioPlayerNode.
The player node's output format needs to be something that the next node can handle, and as far as I understand most nodes can handle a canonical format.
The format provided by AVSpeechSynthesizer is not something that AVAudioMixerNode supports.
So the following:
AVAudioEngine *engine = [[AVAudioEngine alloc] init];
self.playerNode = [[AVAudioPlayerNode alloc] init];
AVAudioFormat *format = [[AVAudioFormat alloc]
initWithSettings:utterance.voice.audioFileSettings];
[engine attachNode:self.playerNode];
[engine connect:self.playerNode to:engine.mainMixerNode format:format];
Throws an exception:
Thread 1: "[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868 \"(null)\""
I am looking for a way to obtain the canonical format for the platform so that I can use AVAudioConverter to convert the buffer.
Since different platforms have different canonical formats, I imagine there should be some library way of doing this. Otherwise each developer will have to redefine it for each platform the code will run on (OSX, iOS etc) and keep it updated when it changes.
I could not find any constant or function that produces such a format, whether as an ASBD or as a settings dictionary.
The smartest way I could think of, which does not work:
AudioStreamBasicDescription toDesc;
FillOutASBDForLPCM(toDesc, [AVAudioSession sharedInstance].sampleRate,
2, 16, 16, kAudioFormatFlagIsFloat, kAudioFormatFlagsNativeEndian);
AVAudioFormat *toFormat = [[AVAudioFormat alloc] initWithStreamDescription:&toDesc];
Even the provided example for iPhone, in the documentation linked above, uses kAudioFormatFlagsAudioUnitCanonical and AudioUnitSampleType which are deprecated.
So what is the correct way to do this?
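One possible direction, offered as a sketch rather than a confirmed answer: AVAudioFormat has a "standard format" initializer (`init(standardFormatWithSampleRate:channels:)` in Swift, `initStandardFormatWithSampleRate:channels:` in Objective-C) that yields deinterleaved native-endian Float32, which is the format AVAudioEngine nodes generally work in. The helper names `standardFormat(matching:)` and `attachPlayer(to:matching:)` below are hypothetical, and the assumption is that keeping the source sample rate and channel count is acceptable for the mixer connection.

```swift
import AVFoundation

/// Returns the deinterleaved Float32 "standard" format that has the same
/// sample rate and channel count as `source`, or nil if none can be created.
/// (Hypothetical helper name, not an Apple API.)
func standardFormat(matching source: AVAudioFormat) -> AVAudioFormat? {
    return AVAudioFormat(standardFormatWithSampleRate: source.sampleRate,
                         channels: source.channelCount)
}

/// Attaches a new player node and connects it to the main mixer using the
/// standard format derived from `source` instead of `source` itself.
/// Returns nil if no standard format could be created.
func attachPlayer(to engine: AVAudioEngine,
                  matching source: AVAudioFormat) -> AVAudioPlayerNode? {
    guard let format = standardFormat(matching: source) else { return nil }
    let player = AVAudioPlayerNode()
    engine.attachNode(player)
    engine.connect(player, to: engine.mainMixerNode, format: format)
    return player
}
```

Buffers delivered by the synthesizer would still need to be converted to this format (for example with AVAudioConverter) before they are scheduled on the player.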
AVAudioEngine
Use a group of connected audio node objects to generate and process audio signals and perform audio input and output.
Posts under AVAudioEngine tag
43 Posts
I’m developing a voice communication app for the iPad with both playback and record and using AudioUnit of type kAudioUnitSubType_VoiceProcessingIO to have echo cancellation.
When I play audio before initializing the recording audio unit, the volume is high. But if I play audio after initializing the audio unit, or after switching to RemoteIO and then back to VPIO, the playback volume is low.
It seems like a bug in iOS. Is there any solution or workaround for this? Searching the net I only found this post, without any solution: https://developer.apple.com/forums/thread/671836
I work on a video conferencing application, which makes use of AVAudioEngine and the videoChat AVAudioSession.Mode
This past Friday, an internal user reported an "audio cutting in and out" issue with their new iPhone 14 Pro, and I was able to reproduce the issue later that day on my iPhone 14 Pro Max. No other iOS devices running iOS 16 are exhibiting this issue.
I have narrowed down the root cause to the videoChat AVAudioSession.Mode after changing line 53 of the ViewController.swift file in Apple's "Using Voice Processing" sample project (https://developer.apple.com/documentation/avfaudio/audio_engine/audio_units/using_voice_processing) from:
try session.setCategory(.playAndRecord, options: .defaultToSpeaker)
to
try session.setCategory(.playAndRecord, mode: .videoChat, options: .defaultToSpeaker)
This only causes issues on my iPhone 14 Pro Max device, not on my iPhone 13 Pro Max, so it seems specific to the new iPhones only.
I am also seeing the following logged to the console using either device, which appears to be specific to iOS 16, but am not sure if it is related to the videoChat issue or not:
2022-09-19 08:23:20.087578-0700 AVEchoTouch[2388:1474002] [as] ATAudioSessionPropertyManager.mm:71 Invalid input size for property 1684431725
2022-09-19 08:23:20.087605-0700 AVEchoTouch[2388:1474002] [as] ATAudioSessionPropertyManager.mm:225 Invalid input size for property 1684431725
I am assuming 1684431725 is 'dfcm' but I am not sure what Audio Session Property that might be.
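The guess can be checked by unpacking the integer into its four bytes. The sketch below is a minimal illustration; `fourCharCode` is a hypothetical helper name, and it only decodes the value, it does not identify which audio session property the code refers to.

```swift
/// Renders a 32-bit property ID or OSStatus as its four-character code,
/// or returns nil if any of the four bytes is not printable ASCII.
func fourCharCode(_ value: UInt32) -> String? {
    // Most significant byte first, as four-character codes are written.
    let bytes = [24, 16, 8, 0].map { UInt8((value >> UInt32($0)) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else { return nil }
    return String(decoding: bytes, as: UTF8.self)
}

print(fourCharCode(1684431725) ?? "not a four-character code")  // prints "dfcm"
```

1684431725 is 0x6466636D, and the bytes 0x64, 0x66, 0x63, 0x6D are the ASCII characters d, f, c, m.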
I am using AVSpeechSynthesizer to get an audio buffer, and AVAudioEngine with AVAudioPlayerNode to play the buffer.
But I am getting this error:
[avae] AVAEInternal.h:76 required condition is false: [AVAudioPlayerNode.mm:734:ScheduleBuffer: (_outputFormat.channelCount == buffer.format.channelCount)]
2023-05-02 03:14:35.709020-0700 AudioPlayer[12525:308940] *** Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: _outputFormat.channelCount == buffer.format.channelCount'
Can anyone please help me play the AVAudioBuffer from the AVSpeechSynthesizer write method?
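The failed condition says the scheduled buffer's channel count differs from the channel count of the player node's output connection. One way this is commonly handled, sketched here under assumptions, is to convert each buffer to the player's output format with AVAudioConverter before scheduling it. The function name `convert(_:to:)` is hypothetical, and creating a converter per buffer is a simplification that is only reasonable when no sample-rate conversion state has to carry over between buffers.

```swift
import AVFoundation

/// Converts `buffer` to `outputFormat` (sample format, channel count, rate).
/// Returns nil if the converter cannot be created or the conversion fails.
func convert(_ buffer: AVAudioPCMBuffer,
             to outputFormat: AVAudioFormat) -> AVAudioPCMBuffer? {
    guard buffer.frameLength > 0,
          let converter = AVAudioConverter(from: buffer.format, to: outputFormat)
    else { return nil }

    // Size the output for the sample-rate ratio, plus one frame of headroom.
    let ratio = outputFormat.sampleRate / buffer.format.sampleRate
    let capacity = AVAudioFrameCount((Double(buffer.frameLength) * ratio).rounded(.up)) + 1
    guard let output = AVAudioPCMBuffer(pcmFormat: outputFormat,
                                        frameCapacity: capacity) else { return nil }

    var supplied = false
    var error: NSError?
    let status = converter.convert(to: output, error: &error) { _, inputStatus in
        if supplied {
            // The single input buffer was already handed over.
            inputStatus.pointee = .endOfStream
            return nil
        }
        supplied = true
        inputStatus.pointee = .haveData
        return buffer
    }
    return status == .error ? nil : output
}
```

A caller would then do something like `if let pcm = buffer as? AVAudioPCMBuffer, let converted = convert(pcm, to: player.outputFormat(forBus: 0)) { player.scheduleBuffer(converted, completionHandler: nil) }` inside the synthesizer's buffer callback, where `player` is the attached and connected AVAudioPlayerNode.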
Hi community
I'm developing an application for macOS and I need to capture the mic audio stream. Currently, using CoreAudio in Swift, I'm able to capture the audio stream using IO Procs, and I have applied AUVoiceProcessing to prevent echo from the speaker device. I was able to connect the audio unit and perform the echo cancellation.
The problem is that when I use AUVoiceProcessing, the gain of the two devices gets reduced, and that affects the volume of both devices (microphone and speaker).
I have tried to disable the AGC using the property kAUVoiceIOProperty_VoiceProcessingEnableAGC but the results are the same.
Is there any option to disable the gain reduction, or is there a better approach to get the echo cancellation working?
It is so frustrating and the second time this has happened to me. I found a fix for the first time, but can't seem to find one now. Help!
Anyone know what to do to stop the noise?
Lisa
Our app is a game written in Unity where we have most of our audio playback handled by Unity. However, one of our game experiences uses microphone input for speech recognition, so in order to perform echo cancellation (while the game has audio playback), we set up an audio stream from Unity to native Swift code that performs the mixing of the input/output nodes.
However, we found that when streaming the audio buffer to our AVAudioSession:
The volume of the audio playback is different.
When capturing a screen recording of the app, the audio played through AVAudioSession does not get captured at all.
We are looking to figure out what could be causing the discrepancy in playback as well as the capture behaviour during screen recordings.
We set up the AVAudioSession with this configuration:
AVAudioSession.sharedInstance().setCategory(AVAudioSession.Category.playAndRecord, options: .mixWithOthers)
with
inputNode.setVoiceProcessingEnabled(true)
after connecting our IO and mixer nodes.
Any suggestions or ideas on what to look out for would be appreciated!
Prior to iOS 17, I used AVAudioFile to open (for reading) the assetURL of MPMediaItem for songs that the user purchased through iTunes Store. With the iOS 17 Beta, this seems no longer possible as AVAudioFile throws this:
ExtAudioFile.cpp:211 about to throw -54: open audio file
AVAEInternal.h:109 [AVAudioFile.mm:135:AVAudioFileImpl: (ExtAudioFileOpenURL((CFURLRef)fileURL, &_extAudioFile)): error -54
Also can't copy the url to Documents directory because I get this:
The file “item.m4a” couldn’t be opened because URL type ipod-library isn’t supported.
This seems to be affecting other apps on the App Store besides mine, and it will reflect very badly on my app if this makes it into the final iOS 17, because I have encouraged users to buy songs on the iTunes Store to use with my app. Now there seems to be no way to access them.
Is this a known bug, or is there some type of workaround?
When recording audio over Bluetooth from AirPods to iPhone using AVAudioRecorder, the Bluetooth audio codec used is always AAC-ELD, regardless of the storage codec selected in the AVAudioRecorder instance.
As far as I know, every Bluetooth device must support SBC, so it should be possible for the AirPods to transmit the recorded audio using the SBC codec instead of AAC-ELD. However, I could not find any resource on how to request this codec using AVAudioRecorder or AVAudioEngine.
Is it possible to request SBC at all, and if yes, how?
Hi!
I am working on an audio application on iOS. This is how I retrieve the workgroup from the RemoteIO audio unit (ioUnit). The unit is initialized and is working fine (meaning that it is regularly called by the system).
os_workgroup_t os_workgroup{nullptr};
uint32_t os_workgroup_index_size;
if (status = AudioUnitGetProperty(ioUnit, kAudioOutputUnitProperty_OSWorkgroup, kAudioUnitScope_Global, 0,
                                  &os_workgroup, &os_workgroup_index_size);
    status != noErr)
{
    throw runtime_error("AudioUnitGetProperty kAudioOutputUnitProperty_OSWorkgroup - Failed with OSStatus: " +
                        to_string(status));
}
However the resulting os_workgroup's value is 0x40. Which seems not correct. No wonder I cannot join any other realtime threads to the workgroup as well. The returned status however is a solid 0.
Can anyone help?
From an app that reads audio from the built-in microphone, I'm receiving many crash logs where the AVAudioEngine fails to start again after the app was suspended.
Basically, I'm calling these two methods in the app delegate's applicationDidBecomeActive and applicationDidEnterBackground methods respectively:
let audioSession = AVAudioSession.sharedInstance()
func startAudio() throws {
    self.audioEngine = AVAudioEngine()
    try self.audioSession.setCategory(.record, mode: .measurement)
    try audioSession.setActive(true)
    self.audioEngine!.inputNode.installTap(onBus: 0, bufferSize: 4096, format: nil, block: { ... })
    self.audioEngine!.prepare()
    try self.audioEngine!.start()
}
func stopAudio() throws {
    self.audioEngine?.stop()
    self.audioEngine?.inputNode.removeTap(onBus: 0)
    self.audioEngine = nil
    try self.audioSession.setActive(false, options: [.notifyOthersOnDeactivation])
}
In the crash logs (iOS 16.6) I'm seeing that this works fine several times as the app is opened and closed, but suddenly the audioEngine.start() call fails with the error
Error Domain=com.apple.coreaudio.avfaudio Code=-10851 "(null)" UserInfo={failed call=err = AUGraphParser::InitializeActiveNodesInInputChain(ThisGraph, *GetInputNode())}
and the audioEngine!.inputNode.outputFormat(forBus: 0) is something like <AVAudioFormat 0x282301c70: 2 ch, 0 Hz, Float32, deinterleaved>. Also, right before installing the tap, audioSession.availableInputs contains an entry of type MicrophoneBuiltIn but audioSession.currentRoute lists no inputs at all.
I was not able to reproduce this situation on my own devices yet.
Does anyone have an idea why this is happening?
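The 0 Hz input format suggests the input node has no usable hardware format at the moment the engine is started. The sketch below does not explain why that happens; it only shows a defensive check that turns the situation into a recoverable error instead of a failed start. `AudioStartError`, `isUsable(_:)`, and `startCapture(on:)` are hypothetical names, and the assumption is that the caller can retry later (for example after a route change).

```swift
import AVFoundation

enum AudioStartError: Error {
    case noInputAvailable
}

/// A format with a zero sample rate or zero channels cannot be tapped.
func isUsable(_ format: AVAudioFormat) -> Bool {
    return format.sampleRate > 0 && format.channelCount > 0
}

/// Installs a tap and starts the engine only if the input node currently
/// reports a usable hardware format; otherwise throws so the caller can retry.
func startCapture(on engine: AVAudioEngine) throws {
    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)
    guard isUsable(format) else {
        throw AudioStartError.noInputAvailable
    }
    input.installTap(onBus: 0, bufferSize: 4096, format: format) { buffer, time in
        // Process the captured audio here.
    }
    engine.prepare()
    try engine.start()
}
```

On iOS the audio session would still be configured and activated before calling `startCapture(on:)`, as in the original `startAudio()`.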
My User Generated Content for my App is audio-based only and anonymous.
All the content is deleted after 24 hours. Do I still need a report button, since I don't know the user and the content gets deleted anyway?
Hello, I have been struggling to resolve the issue described in the question above.
I can speak an utterance while my iPhone is on, but when the app goes into the background (for example, when I turn off the iPhone), it no longer speaks.
I think it should be possible to play audio or speak an utterance in the background, because YouTube can play music in the background.
Any help please??
Sound Pad throws an error on the audio device on iPad.
Sound Pad
Version 1.0.0
3 April 2023
Swift 5.8 Edition
A fatal error was found in AudioPlayer.swift
Line which causes this
var engine: AVAudioEngine
Hi there, I'm having some trouble with AVAudioMixerNode only working when there is a single input, and outputting silence or very quiet buzzing when >1 input node is connected. My setup has voice processing enabled, input going to a sink, and N source nodes going to the main mixer node, going to the output node. In all cases I am connecting nodes in the graph with the same declared format: 48kHz 1 channel Float32 PCM.
This is working great for 1 source node, but as soon as I add a second it breaks. I can reproduce this behaviour in the SignalGenerator sample, when the same format is used everywhere. Again, it'll work fine with 1 source node even in this configuration, but add another and there's silence.
Am I doing something wrong with formats here? Is this expected? As I understood it with voice processing on and use of a mixer node I should be able to use my own format essentially everywhere in my graph?
My SignalGenerator modified repro example follows:
import Foundation
import AVFoundation
// True replicates my real app's behaviour, which is broken.
// You can remove one source node connection
// to make it work even when this is true.
let showBrokenState: Bool = true
// SignalGenerator constants.
let frequency: Float = 440
let amplitude: Float = 0.5
let duration: Float = 5.0
let twoPi = 2 * Float.pi
let sine = { (phase: Float) -> Float in
    return sin(phase)
}
let whiteNoise = { (phase: Float) -> Float in
    return ((Float(arc4random_uniform(UINT32_MAX)) / Float(UINT32_MAX)) * 2 - 1)
}
// My "application" format.
let format: AVAudioFormat = .init(commonFormat: .pcmFormatFloat32,
                                  sampleRate: 48000,
                                  channels: 1,
                                  interleaved: true)!
// Engine setup.
let engine = AVAudioEngine()
let mainMixer = engine.mainMixerNode
let output = engine.outputNode
try! output.setVoiceProcessingEnabled(true)
let outputFormat = engine.outputNode.inputFormat(forBus: 0)
let sampleRate = Float(format.sampleRate)
let inputFormat = format
var currentPhase: Float = 0
let phaseIncrement = (twoPi / sampleRate) * frequency
let srcNodeOne = AVAudioSourceNode { _, _, frameCount, audioBufferList -> OSStatus in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    for frame in 0..<Int(frameCount) {
        let value = sine(currentPhase) * amplitude
        currentPhase += phaseIncrement
        if currentPhase >= twoPi {
            currentPhase -= twoPi
        }
        if currentPhase < 0.0 {
            currentPhase += twoPi
        }
        for buffer in ablPointer {
            let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
            buf[frame] = value
        }
    }
    return noErr
}
let srcNodeTwo = AVAudioSourceNode { _, _, frameCount, audioBufferList -> OSStatus in
    let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
    for frame in 0..<Int(frameCount) {
        let value = whiteNoise(currentPhase) * amplitude
        currentPhase += phaseIncrement
        if currentPhase >= twoPi {
            currentPhase -= twoPi
        }
        if currentPhase < 0.0 {
            currentPhase += twoPi
        }
        for buffer in ablPointer {
            let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
            buf[frame] = value
        }
    }
    return noErr
}
engine.attach(srcNodeOne)
engine.attach(srcNodeTwo)
engine.connect(srcNodeOne, to: mainMixer, format: inputFormat)
engine.connect(srcNodeTwo, to: mainMixer, format: inputFormat)
engine.connect(mainMixer, to: output, format: showBrokenState ? inputFormat : outputFormat)
// Put the input node to a sink just to match the formats and make VP happy.
let sink: AVAudioSinkNode = .init { timestamp, numFrames, data in
    .zero
}
engine.attach(sink)
engine.connect(engine.inputNode, to: sink, format: showBrokenState ? inputFormat : outputFormat)
mainMixer.outputVolume = 0.5
try! engine.start()
CFRunLoopRunInMode(.defaultMode, CFTimeInterval(duration), false)
engine.stop()
Hello,
I'm facing an issue with Xcode 15 and iOS 17: it seems impossible to get AVAudioEngine's audio input node to work on simulator.
inputNode has a 0ch, 0kHz input format,
connecting input node to any node or installing a tap on it fails systematically.
What we tested:
Everything works fine on iOS simulators <= 16.4, even with Xcode 15.
Nothing works on iOS simulator 17.0 on Xcode 15.
Everything works fine on iOS 17.0 device with Xcode 15.
More details on this here: https://github.com/Fesongs/InputNodeFormat
Any idea on this? Something I'm missing?
Thanks for your help 🙏
Tom
PS: I filed a bug on Feedback Assistant, but it usually takes ages to get any answer so I'm also trying here 😉
In a VoIP application with CallKit enabled, when we try to play a video through AVPlayer, the video content updates frame by frame and the audio of the content is not audible. This issue is observed only on iOS 17. Any idea how we can resolve this?
Hello,
I started setting up stereo audio recording (both audio and video are recorded), and the audio quality seems to be lower than the quality obtained with the native Camera application (configured for stereo).
Checking the console log, I found a difference between the Camera app and mine regarding MXSessionMode (of mediaserverd): the Camera application gives MXSessionMode = SpatialRecording and mine gives MXSessionMode = VideoRecording.
How can I configure capture session to finally have MXSessionMode = SpatialRecording?
Any suggestion?
Best regards
Basically, for this iPhone app I want to be able to record from either the built-in microphone or a connected USB audio device while simultaneously playing back processed audio on connected AirPods. It's a pretty simple AVAudioEngine setup that includes a couple of effects units. The category is set to .playAndRecord with the .allowBluetooth and .allowBluetoothA2DP options added. With no attempt to set the preferred input and AirPods connected, the AirPods mic is used and output also goes to the AirPods. If I call setPreferredInput with either the built-in mic or a USB audio device, I get input as desired, but then output always goes to the speaker. I don't really see a good explanation for this, and overrideOutputAudioPort does not seem to have suitable options.
I am testing this on an iPhone 14 Pro.
How can I record audio in a keyboard extension? I've enabled microphone support by enabling "RequestsOpenAccess". When I try to record, I get the error below in the console. This doesn't make sense as Apple's docs seem to say that microphone access is allowed with Full Keyboard Access. What is the point of enabling the microphone if the app cannot access the data from the microphone?
-CMSUtilities- CMSUtility_IsAllowedToStartRecording: Client sid:0x2205e, XXXXX(17965), 'prim' with PID 17965 was NOT allowed to start recording because it is an extension and doesn't have entitlements to record audio.