Here is the demo from Apple's site
This issue is specific to iOS 18.
When running this demo, after a pause in speech the recognitionTask(with:resultHandler:) handler delivers only the text spoken after the pause, not the earlier text concatenated with the newly spoken text.
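One possible app-side workaround is to stitch the text together manually. The sketch below is not from Apple's demo: `TranscriptAccumulator` is a hypothetical helper, and its restart detection (the new partial text is less than half as long as the previous one) is a crude assumption that would need tuning against real recognizer output.

```swift
/// Hypothetical helper that keeps earlier text when the recognizer
/// starts delivering a fresh transcript after a pause.
struct TranscriptAccumulator {
    private(set) var committed = ""  // text from earlier utterances
    private(set) var current = ""    // latest partial text of the ongoing utterance

    /// Feed every string delivered by the result handler.
    mutating func update(with partial: String) {
        // Assumption: a sudden drop to less than half the previous length
        // means the recognizer restarted rather than revised its text.
        if !current.isEmpty && partial.count * 2 < current.count {
            committed = committed.isEmpty ? current : committed + " " + current
        }
        current = partial
    }

    /// Earlier utterances followed by the ongoing one.
    var fullText: String {
        [committed, current].filter { !$0.isEmpty }.joined(separator: " ")
    }
}
```

In the result handler, one would call `update(with:)` with `result.bestTranscription.formattedString` and display `fullText`.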
AVAudioEngine
Use a group of connected audio node objects to generate and process audio signals and perform audio input and output.
Posts under AVAudioEngine tag
52 Posts
Hi!
I have an AVAudioSequencer with some AVMusicTracks that are filled with AVParameterEvents.
If I toggle the isMuted property of a track, it mutes instantly when changed to true. However, after setting isMuted back to false, the events only trigger on the next round of the loop, not instantly. Is this intended behaviour, and is there some way to get the events to trigger immediately after setting isMuted to false?
AVAudioEngine and AVAudioSession
Welcome! I will start off with the term AVAudioEngineImpl::Initialize(NSError**).
Why? I want those who run into this issue to be able to find this post through search engines!
This is a short breakdown based on what I observed while trying to use these two components. It's not a guide that goes into all the details.
If you're trying to figure out how to fix a crash, you may find a common fix in this post!
Is it possible to use AVAudioEngine and AVAudioSession together?
The answer is yes.
But you will face challenges regarding it. Mostly AVAudioEngine. Whatever you're trying to do, it will take a lot of testing. I don't know how it will be with an IDE. But with just .app and iPhone it will take some testing. Or a lot of testing.
Something that helped me fix a crash was this: https://developer.apple.com/documentation/avfaudio/audio_engine/audio_units/using_voice_processing
This example project by Apple uses both AVAudioEngine and AVAudioSession.
How can I fix AVAudioEngineImpl::Initialize(NSError**) ?
I think this depends. If you're lucky and have a crash log, you may find clues, but the stack trace sometimes doesn't really help either.
I will mention common cases that I encountered though.
inputNode
https://developer.apple.com/documentation/avfaudio/avaudioengine/1386063-inputnode
You need an inputNode apparently. You need to access it or else I think there won't be one. And if there isn't one, AVAudioEngine.start will most likely crash.
The audio engine creates a singleton on demand when first accessing this variable.
Doing this has prevented this common issue for me.
.prepare deallocates and can cause a crash if you restart your AudioEngine
Another issue I faced was handling .prepare wrong. You don't need .prepare. But if you use installTap or other things, I think you need it.
Here is a common thing to note.
If you had previously initialized inputNode, it could be gone after using .prepare.
You have to ensure you access AVAudioEngine.inputNode (or whatever node you need) again before calling .start().
The Voice Processing project does this by creating a managing controller for AVAudioEngine with a sort of "setup" function, which ensures that everything is ready before .prepare and .start get called.
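A minimal sketch of that kind of controller, under the assumptions described in this post. `EngineController` and its method names are hypothetical and are not taken from Apple's sample project; the buffer size is an arbitrary choice.

```swift
import AVFoundation

/// Hypothetical controller: touches the nodes it needs before every start.
final class EngineController {
    private let engine = AVAudioEngine()

    func start() throws {
        // Guard against starting an engine that is already running.
        guard !engine.isRunning else { return }

        // Accessing inputNode makes the engine create it on demand.
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)

        // Reinstall the tap on every start so nothing stale is left over.
        input.removeTap(onBus: 0)
        input.installTap(onBus: 0, bufferSize: 4096, format: format) { _, _ in
            // Handle captured audio here.
        }

        engine.prepare()
        try engine.start()
    }

    func stop() {
        guard engine.isRunning else { return }
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
    }
}
```

Routing every start through one function means the node access, tap installation, and the isRunning check cannot be skipped by accident.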
AVAudioSession's setCategory
You have to experiment with it. The crashes can be very weird. Sometimes your App will only crash once, and then only after you install it again, or if you start it up.
You are actually able to use .setActive and .setCategory with AVAudioEngine. Just do not call .setActive(false) before you've stopped the audio engine, as it will fail.
Sometimes I'd run into an issue with .setActive(true) so you really have to experiment if leaving that part out resolves the issue or not.
try session.setCategory(.multiRoute, mode: .default, options: [.defaultToSpeaker, .mixWithOthers])
Experiment with it. But these .multiRoute and .mixWithOthers have allowed me to use AVAudioEngine to make a test recording. And I can even switch the Data Sources and Polar Patterns without any issues.
Sometimes you can get away without setting .setActive at all. Not sure if AVAudioEngine does it automatically.
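The ordering described above might look like the following on iOS. This is a sketch, not a verified fix: the function names are hypothetical, the category is the one this post reports working, and `.notifyOthersOnDeactivation` is an added assumption.

```swift
import AVFoundation

/// Configure and activate the session first, then start the engine.
func activateAndStart(_ engine: AVAudioEngine) throws {
    let session = AVAudioSession.sharedInstance()
    // Category and options are what worked in this post; adjust as needed.
    try session.setCategory(.multiRoute, mode: .default, options: [.mixWithOthers])
    try session.setActive(true)

    _ = engine.inputNode  // make sure the input node exists
    if !engine.isRunning {
        engine.prepare()
        try engine.start()
    }
}

/// Stop the engine first, and only then deactivate the session.
func stopAndDeactivate(_ engine: AVAudioEngine) throws {
    if engine.isRunning {
        engine.stop()
    }
    try AVAudioSession.sharedInstance()
        .setActive(false, options: .notifyOthersOnDeactivation)
}
```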
Short Summary
If you use .prepare and then .stop, make sure to initialize things like .inputNode before calling .prepare and .start again. (THIS CAN BE DIFFERENT)
Only call .setActive(false) after you used .stop. Otherwise I believe it has no chance to stop it.
AVAudioSession setCategory is important. Try .multiRoute with .mixWithOthers, or experiment with all the categories and modes.
If you manage to solve your crash, you'll be able to indeed change the Data Sources and Polar Patterns and more!
Use isRunning before using .start, this will save you from another crash. If you use .start while it's already running, I think try and catch won't save you here, you have to ensure you're not starting it twice.
I hope that this short breakdown will help you resolve your crash. If you get deeper into AVAudioEngine and AVAudioSession, you'll probably face more crashes. I have yet to figure out how to solve them. I have a lot of trouble getting my testing app onto my iPhone, so I am sorry if this guide didn't cover every detail.
A HUGE tip from me is to check the documentation. For example, when I read the documentation for inputNode I learned why my app crashed: I never accessed and initialized one.
The developer documentation can be a bit of a labyrinth, and I strongly recommend reading the page for every property you access if you believe it causes issues. I also recommend finding example projects like the Voice Processing one, as there aren't any code examples in the documentation.
If I have a Bluetooth speaker connected and I call installTap on the input node, the callback fires for 1-2 seconds and then stops. I don't see any route change or other notification handler being called in between.
engine.inputNode.removeTap(onBus: 0)
engine.inputNode.installTap(
    onBus: 0,
    bufferSize: 4096,
    format: format
) { buffer, _ in
    // 3
    guard let channelData = buffer.floatChannelData else {
        return
    }
    // This callback fails after some time.
}
I'm not sure if this is expected, but some other applications I tried seem to work fine.
If I remove bluetooth device, my input works fine.
Also I have no issues with output on Speaker.
Calls to ExtAudioFileRead are throwing OSStatus 561145203 (AVAudioSessionErrorCodeResourceNotAvailable) on iOS and iPadOS 18 -- earlier versions of iOS have not exhibited this behavior. This is a longstanding code path that has seen a spike of these error codes since iOS 18's release.
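Many Core Audio status codes are packed four-character codes, and decoding them can make logs easier to search. The helper below is a small illustrative sketch (`fourCharCode(from:)` is a hypothetical name, not a system API); for 561145203 it yields `!res`, which matches the resource-not-available constant named above.

```swift
/// Hypothetical helper: decodes an OSStatus-style value into its
/// four-character code, or returns nil if any byte is not printable ASCII.
func fourCharCode(from status: Int32) -> String? {
    let value = UInt32(bitPattern: status)
    // Extract the four bytes, most significant first.
    let bytes = [24, 16, 8, 0].map { UInt8((value >> UInt32($0)) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else { return nil }
    return String(decoding: bytes, as: UTF8.self)
}
```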
The following is also printed to the Xcode 16 console:
On watchOS 11, playback of my recording pauses when I lower my wrist.
This did not happen in previous versions.
I make a recording in my app.
When I play it back in the app,
it plays fine in debug mode (with Xcode connected), so I could not debug the problem.
Otherwise, it pauses automatically (when my wrist is down or the inactivity timeout elapses).
I want playback to continue.
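One configuration worth checking is whether the session is activated for long-form audio before playback starts. The sketch below assumes the app also declares the audio background mode. `activateLongFormPlayback(then:)` is a hypothetical name, and the long-form policy may prompt the user to pick a Bluetooth route, which may or may not suit this app.

```swift
import AVFoundation

/// Hypothetical helper: activates a long-form playback session on watchOS,
/// then runs `play` on the main queue once activation succeeds.
func activateLongFormPlayback(then play: @escaping () -> Void) {
    let session = AVAudioSession.sharedInstance()
    do {
        try session.setCategory(.playback,
                                mode: .default,
                                policy: .longFormAudio,
                                options: [])
    } catch {
        print("setCategory failed: \(error)")
        return
    }

    // Asynchronous activation; the system may show a route picker first.
    session.activate(options: []) { success, error in
        guard success else {
            print("Activation failed: \(String(describing: error))")
            return
        }
        DispatchQueue.main.async { play() }
    }
}
```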
Hi everyone,
I’m working on a project that involves streaming audio over WebSockets, and I need to compress the audio to reduce bandwidth usage. I’m currently using AVAudioEngine to capture and process audio in PCM format (AVAudioPCMBuffer), but I want to compress the buffer into Opus (or another efficient codec) before sending it over the network.
Has anyone worked with compressing an AVAudioPCMBuffer into Opus format within a tap on the inputNode, or could you recommend the best approach for compressing the PCM buffer into a different format? I haven’t been able to find a working solution for this.
Any advice or code examples would be greatly appreciated!
Thanks in advance,
Ondřej
--
My current code without the compression:
inputNode.installTap(onBus: .zero, bufferSize: 1440, format: nil) { [weak self] buffer, time in
    guard let self else {
        return
    }
    // 1. Send data
    // a) Convert the buffer into the desired format
    if let outputBuffer = buffer.convert(toFormat: Self.websocketInputFormat) {
        // b) Use the converted buffer
        // TODO: compress it into a different format
        if let data = outputBuffer.convertToData() {
            self.sendAudio(data)
        }
    }
    // 2. Get sound level
    self.visualizeRecorderBuffer(buffer)
}
func convert(toFormat outputFormat: AVAudioFormat) -> AVAudioPCMBuffer? {
    let outputFrameCapacity = AVAudioFrameCount(
        round(Double(frameLength) * (outputFormat.sampleRate / format.sampleRate))
    )
    guard
        let outputBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: outputFrameCapacity),
        let converter = AVAudioConverter(from: format, to: outputFormat)
    else {
        return nil
    }
    converter.convert(to: outputBuffer, error: nil) { packetCount, status in
        status.pointee = .haveData
        return self
    }
    return outputBuffer
}
static private let websocketInputFormat = AVAudioFormat(
    commonFormat: .pcmFormatInt16,
    sampleRate: 16000,
    channels: 1,
    interleaved: false
)!
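One direction to explore is letting AVAudioConverter produce Opus packets directly. The following is an untested sketch: `OpusEncoder` is a hypothetical class, the 48 kHz / 960-frame format is an assumption, and whether the system encoder accepts a given input format on a given OS version needs to be verified on device.

```swift
import AVFoundation

/// Hypothetical wrapper around AVAudioConverter for PCM -> Opus.
final class OpusEncoder {
    private let converter: AVAudioConverter
    private let outputFormat: AVAudioFormat

    init?(inputFormat: AVAudioFormat) {
        // Assumed target: mono Opus, 48 kHz, 20 ms (960-frame) packets.
        var description = AudioStreamBasicDescription(
            mSampleRate: 48_000, mFormatID: kAudioFormatOpus, mFormatFlags: 0,
            mBytesPerPacket: 0, mFramesPerPacket: 960, mBytesPerFrame: 0,
            mChannelsPerFrame: 1, mBitsPerChannel: 0, mReserved: 0)
        guard let output = AVAudioFormat(streamDescription: &description),
              let converter = AVAudioConverter(from: inputFormat, to: output)
        else { return nil }
        self.outputFormat = output
        self.converter = converter
    }

    /// Returns the Opus packets produced from one PCM buffer (possibly none).
    func encode(_ buffer: AVAudioPCMBuffer) -> [Data] {
        let output = AVAudioCompressedBuffer(
            format: outputFormat,
            packetCapacity: 8,
            maximumPacketSize: converter.maximumOutputPacketSize)

        var delivered = false
        var error: NSError?
        let status = converter.convert(to: output, error: &error) { _, inputStatus in
            // Hand the buffer over exactly once, then report no more data.
            if delivered {
                inputStatus.pointee = .noDataNow
                return nil
            }
            delivered = true
            inputStatus.pointee = .haveData
            return buffer
        }
        guard status != .error, error == nil,
              let descriptions = output.packetDescriptions else { return [] }

        // Split the compressed buffer into one Data value per packet.
        return (0..<Int(output.packetCount)).map { index in
            let packet = descriptions[index]
            return Data(bytes: output.data.advanced(by: Int(packet.mStartOffset)),
                        count: Int(packet.mDataByteSize))
        }
    }
}
```

The encoder would be created once outside the tap, so the converter can carry leftover frames from one buffer to the next.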
I have a visionOS app that plays audio using AVAudioEngine and presents both a window and an immersive space. If I close the window, the audio session gets interrupted and attempting to restart the session and audio engine has no effect. I need to dismiss the app, then reopen it, which reopens the main window, in order for audio to start playing again.
This is in all visionOS 2 betas. Note that I have background audio enabled for my app.
Hi! I have a music app using AVAudioEngine. Right now, I have set it up to play multi channel tracks and show "Multichannel" in the volume controls. However, I am unable to figure out how to get it to use Dolby Atmos.
Is there something that needs to be enabled? Is it even possible with AVAudioEngine? I saw some apps that are capable of playing Dolby Atmos, but they do not have an EQ feature, so I'm guessing that they are not using AVAudioEngine.
I'm using AVAudioEngine to play AVAudioPCMBuffers. I'd like to synchronize some events with the playback. For example if the audio's frame position is >= some point && less than some point trigger some code.
So I'm looking at - (void)installTapOnBus:(AVAudioNodeBus)bus bufferSize:(AVAudioFrameCount)bufferSize format:(AVAudioFormat * __nullable)format block:(AVAudioNodeTapBlock)tapBlock;
Now I have frame positions calculated (predetermined before audio is scheduled I already made all necessary computations) . So I just need to fire code at certain points during playback:
[playerNode installTapOnBus:bus
                 bufferSize:bufferSize
                     format:format
                      block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
    //Inspect current audio here and fire...
}];
[playerNode scheduleBuffer:fullbuffer
                    atTime:startTime
                   options:0
    completionCallbackType:AVAudioPlayerNodeCompletionDataPlayedBack
         completionHandler:^(AVAudioPlayerNodeCompletionCallbackType callbackType)
{
    // some code is here, not important to this question.
}];
The problem I'm having is figuring out where in the full buffer I am within the tap block. The tap block passes chunks (not the full audio buffer). I tried using the when parameter of the block to calculate the frame position relative to the entire audio but have been unsuccessful so far. I'm assuming the when parameter is relative to the buffer passed in the tap block (not the entire audio buffer I scheduled).
Not installing a tap and just using a timer before scheduling my fullBuffer has given me good results but I'd rather avoid using a timer if possible and use sample time.
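One approach that may help is converting the tap's node time into the player's own timeline with playerTime(forNodeTime:), then checking which precomputed frame positions fall inside the current chunk. This is a Swift sketch with hypothetical helper names. It assumes the markers are expressed at the player's output sample rate and counted from the start of the player's timeline, and it fires when a chunk is delivered to the tap, which is not exactly when it is heard.

```swift
import AVFoundation

/// Pure helper: markers that fall inside [start, start + frameLength).
func markers(_ all: [AVAudioFramePosition],
             inChunkStartingAt start: AVAudioFramePosition,
             frameLength: AVAudioFrameCount) -> [AVAudioFramePosition] {
    let end = start + AVAudioFramePosition(frameLength)
    return all.filter { $0 >= start && $0 < end }
}

/// Installs a tap that reports each marker as its chunk is delivered.
func installMarkerTap(on playerNode: AVAudioPlayerNode,
                      markerFrames: [AVAudioFramePosition],
                      onMarker: @escaping (AVAudioFramePosition) -> Void) {
    let format = playerNode.outputFormat(forBus: 0)
    playerNode.installTap(onBus: 0, bufferSize: 1024, format: format) {
        [weak playerNode] buffer, when in
        // Translate node time into the player's timeline.
        guard let playerTime = playerNode?.playerTime(forNodeTime: when),
              playerTime.isSampleTimeValid else { return }
        let hits = markers(markerFrames,
                           inChunkStartingAt: playerTime.sampleTime,
                           frameLength: buffer.frameLength)
        hits.forEach(onMarker)
    }
}
```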
Topic: Media Technologies
SubTopic: Audio
Tags: AVAudioNode, AVAudioSession, AVAudioEngine, AVFoundation
Hi, I'm relatively new to iOS development and kindly ask for some feedback on a strategy to achieve this desired behavior in my app.
My Question:
What would be the best strategy for sound effect playback when an app is in the background with precise timing? Is this even possible?
Context:
I created a basic countdown timer app (targeting iOS 17 with Swift/SwiftUI.). Countdown sessions can last up to 30-60 mins. When the timer is started it progresses through a series of sub-intervals and plays a short sound for each one. I used AVAudioPlayer and everything works fine when the app is in the foreground. I'm considering switching to AVAudioEngine b/c precise timing is very important and the AIs tell me this would have better precision.
I'm already setting "App plays audio or streams audio/video using AirPlay" in my Plist, and have configured:
try AVAudioSession.sharedInstance().setCategory(.playback, mode: .default, options: .mixWithOthers)
Curiously, when testing on my iPhone 13 mini, sounds sometimes still play when the app is in the background, but not always.
What I've considered:
Background Tasks: Would they make any sense for this use-case? Seems like not if the allowed time is short & limited by the system.
Pre-scheduling all Sounds: Not sure this would even work and seems like a lot of memory would be needed (could be hundreds of intervals).
ActivityKit Alerts: works but with a ~50ms delay which is too long for my purposes.
Pre-Render all SFX to 1 large audio file: Seems like a lot of work and processing time and probably not worth it. I hope there's a better solution.
I'd really appreciate any feedback.
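For reference, the pre-scheduling idea could be sketched with AVAudioPlayerNode, which accepts sample-based start times on its own timeline. The function names are hypothetical. The sketch assumes one short tick buffer is reused for every interval and that the engine keeps running in the background, which this post has not yet established.

```swift
import AVFoundation

/// Pure helper: sample positions of `count` ticks spaced `interval` seconds apart.
func tickSampleTimes(everySeconds interval: Double,
                     count: Int,
                     sampleRate: Double) -> [AVAudioFramePosition] {
    (0..<count).map { index in
        AVAudioFramePosition((Double(index) * interval * sampleRate).rounded())
    }
}

/// Schedules the same tick buffer at each position on the player's timeline.
/// The player node must already be attached and connected to an engine.
func scheduleTicks(on player: AVAudioPlayerNode,
                   tick: AVAudioPCMBuffer,
                   everySeconds interval: Double,
                   count: Int) {
    let sampleRate = tick.format.sampleRate
    for sampleTime in tickSampleTimes(everySeconds: interval,
                                      count: count,
                                      sampleRate: sampleRate) {
        let when = AVAudioTime(sampleTime: sampleTime, atRate: sampleRate)
        player.scheduleBuffer(tick, at: when, options: [], completionHandler: nil)
    }
}
```

After scheduling, calling `player.play()` starts the timeline, so the first tick plays at sample 0.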
I'm using an AVAudioConverter object to decode an OPUS stream for VoIP. The decoding itself works well. However, whenever the stream stalls (no more audio packets are available to decode because of network instability), this is audible as crackling or an abrupt stop in the decoded audio. OPUS can mitigate this: the C library lets you indicate packet loss by passing a null pointer to
int opus_decode_float (OpusDecoder * st, const unsigned char * data, opus_int32 len, float * pcm, int frame_size, int decode_fec), see https://opus-codec.org/docs/opus_api-1.2/group__opus__decoder.html#ga9c554b8c0214e24733a299fe53bb3bd2.
However, with AVAudioConverter using Swift I'm constructing an AVAudioCompressedBuffer like so:
let compressedBuffer = AVAudioCompressedBuffer(
    format: VoiceEncoder.Constants.networkFormat,
    packetCapacity: 1,
    maximumPacketSize: data.count
)
compressedBuffer.byteLength = UInt32(data.count)
compressedBuffer.packetCount = 1
compressedBuffer.packetDescriptions!
    .pointee.mDataByteSize = UInt32(data.count)
data.copyBytes(
    to: compressedBuffer.data
        .assumingMemoryBound(to: UInt8.self),
    count: data.count
)
where data: Data contains the raw OPUS frame to be decoded.
How can I specify data loss in this context and cause the AVAudioConverter to output PCM data whenever no more input data is available?
More context:
I'm specifying the audio format like this:
static let frameSize: UInt32 = 960
static let sampleRate: Float64 = 48000.0
static var networkFormatStreamDescription =
    AudioStreamBasicDescription(
        mSampleRate: sampleRate,
        mFormatID: kAudioFormatOpus,
        mFormatFlags: 0,
        mBytesPerPacket: 0,
        mFramesPerPacket: frameSize,
        mBytesPerFrame: 0,
        mChannelsPerFrame: 1,
        mBitsPerChannel: 0,
        mReserved: 0
    )
static let networkFormat =
    AVAudioFormat(
        streamDescription:
            &networkFormatStreamDescription
    )!
I've tried 1) setting byteLength and packetCount to zero and 2) returning nil but setting .haveData in the AVAudioConverterInputBlock I'm using with no success.