Post not yet marked as solved
I've been given a task that requires evaluating the possibility, as the title says, of recording via the mic on a BLE headset while playing sound through the built-in speaker at the same time on iOS.
I've implemented forcing the audio route to the built-in speaker whenever the BLE headset connects or disconnects. That works if both the mic and the speaker need to be set to the built-in ones. But after days of searching and experimenting, I've found that it is not possible to set the mic and speaker separately. Even specifying the input device on AVAudioEngine is supported only on macOS, not iOS.
Can anyone give me a definitive answer on whether it is possible to record via the mic on a BLE headset and play sound via the built-in speaker at the same time?
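For reference, a minimal sketch of the session configuration usually attempted for this split route, assuming a play-and-record session. Note that with the Bluetooth HFP option, iOS negotiates the headset's mic and speaker as a single route, so overriding the output to the built-in speaker typically moves the input off the headset as well:

```swift
import AVFoundation

// Sketch only: with .allowBluetooth (HFP), iOS treats the headset's mic and
// speaker as one combined route, so the speaker override below usually also
// moves the input away from the headset.
func configureSession() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord,
                            mode: .default,
                            options: [.allowBluetooth, .defaultToSpeaker])
    try session.setActive(true)
    // Force output to the built-in speaker after activation:
    try session.overrideOutputAudioPort(.speaker)
}
```

As far as I can tell, the session-level API offers no supported way to pin the input to the headset while this output override is active, which appears consistent with the behavior described above.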
Post not yet marked as solved
How can you add a live audio player to an Xcode project where the user has an interactive UI to control the audio, and playback keeps going when they exit the app or turn their device's screen off? Is there a framework or API that will work for this? Thanks! Really need help with this…. 🤩 I have looked everywhere and haven't found something that works….
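For context, a sketch of the setup commonly used for background playback, assuming the target's Info.plist declares the "audio" background mode (Xcode: Signing & Capabilities > Background Modes > Audio); `player` here is a hypothetical AVPlayer driving the audio:

```swift
import AVFoundation
import MediaPlayer

// Sketch: the .playback category plus the "audio" background mode keeps
// audio running after the app is backgrounded or the screen locks.
func setUpBackgroundPlayback(player: AVPlayer) throws {
    try AVAudioSession.sharedInstance().setCategory(.playback)
    try AVAudioSession.sharedInstance().setActive(true)

    // Lock-screen / Control Center transport controls:
    let center = MPRemoteCommandCenter.shared()
    center.playCommand.addTarget { _ in player.play(); return .success }
    center.pauseCommand.addTarget { _ in player.pause(); return .success }
}
```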
Post not yet marked as solved
I'm unclear on how to access the inward facing microphone in the AirPods Pro (not the outward facing one). If this is possible, can you point me in the right direction?
More context: there is a ticking noise coming from a spasm inside someone's ear that I'd like to try canceling for them.
The standard AirPods Pro noise cancellation modes don't have any effect on the sound.
I know latency may be too high to do this on the phone with a custom app, but thought if I could reach the point of that being the problem, then I could experiment with predictive algorithms.
Thank you in advance for ideas or recommendations.
Post not yet marked as solved
This just seems like a useful thing to have when rendering audio. For example, let's say you have an effect that pitches audio up/down. That typically requires that you know the sample rate of the incoming audio. The way I do this right now is just to save the sample rate after the AUAudioUnit's render resources have been allocated, but being provided this info on a per-render-callback basis seems more useful.
Another use case is for AUAudioUnits on the input chain. Since the format for connections must match the hardware format, you can no longer explicitly set the format that you expect the audio to come in at. You can check the sample rate on the AVAudioEngine's input node or on the AVAudioSession singleton, but when you are working with the audio from within the render callback, you don't want to be calling those methods because they may block. This is especially true when using AVAudioSinkNode, where you don't have the ability to set the sample rate before the underlying node's render resources are allocated.
Am I missing something here, or does this actually seem useful?
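As an illustration of the workaround described above (caching the rate once render resources exist, so the render block never touches potentially blocking APIs), a minimal sketch using a hypothetical AUAudioUnit subclass:

```swift
import AVFoundation

// Sketch: cache the sample rate when render resources are allocated, so the
// real-time render block can read a plain stored property instead of
// querying AVAudioSession or the engine. MyAudioUnit is hypothetical.
class MyAudioUnit: AUAudioUnit {
    private var cachedSampleRate: Double = 44100

    override func allocateRenderResources() throws {
        try super.allocateRenderResources()
        // Safe to query here; this runs before rendering starts.
        cachedSampleRate = outputBusses[0].format.sampleRate
    }
}
```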
Post not yet marked as solved
When using the VoiceProcessingIO audio unit with the voiceChat audio session mode to get echo cancellation, I can't play audio in stereo; it only allows mono audio.
How can I enable stereo playback with echo cancellation?
Is this some kind of limitation? It isn't mentioned anywhere in the documentation.
Post not yet marked as solved
I’m developing a voice communication app for the iPad with both playback and record and using AudioUnit of type kAudioUnitSubType_VoiceProcessingIO to have echo cancellation.
When I play audio before initializing the recording audio unit, the volume is high. But if I play audio after initializing the audio unit, or when switching to RemoteIO and then back to VPIO, the playback volume is low.
It seems like a bug in iOS, any solution or workaround for this? Searching the net I only found this post without any solution: https://developer.apple.com/forums/thread/671836
Post not yet marked as solved
I receive a buffer from [AVSpeechSynthesizer convertToBuffer:fromBuffer:] and want to schedule it on an AVAudioPlayerNode.
The player node's output format needs to be something that the next node can handle, and as far as I understand most nodes can handle a canonical format.
The format provided by AVSpeechSynthesizer is not something that AVAudioMixerNode supports.
So the following:
AVAudioEngine *engine = [[AVAudioEngine alloc] init];
playerNode = [[AVAudioPlayerNode alloc] init];
AVAudioFormat *format = [[AVAudioFormat alloc]
initWithSettings:utterance.voice.audioFileSettings];
[engine attachNode:self.playerNode];
[engine connect:self.playerNode to:engine.mainMixerNode format:format];
Throws an exception:
Thread 1: "[[busArray objectAtIndexedSubscript:(NSUInteger)element] setFormat:format error:&nsErr]: returned false, error Error Domain=NSOSStatusErrorDomain Code=-10868 \"(null)\""
I am looking for a way to obtain the canonical format for the platform so that I can use AVAudioConverter to convert the buffer.
Since different platforms have different canonical formats, I imagine there should be some library way of doing this. Otherwise each developer will have to redefine it for each platform the code will run on (macOS, iOS, etc.) and keep it updated when it changes.
I could not find any constant or function which can produce such a format, ASBD, or settings dictionary.
The smartest way I could think of, which does not work:
AudioStreamBasicDescription toDesc;
FillOutASBDForLPCM(toDesc, [AVAudioSession sharedInstance].sampleRate,
2, 16, 16, kAudioFormatFlagIsFloat, kAudioFormatFlagsNativeEndian);
AVAudioFormat *toFormat = [[AVAudioFormat alloc] initWithStreamDescription:&toDesc];
Even the provided example for iPhone, in the documentation linked above, uses kAudioFormatFlagsAudioUnitCanonical and AudioUnitSampleType which are deprecated.
So what is the correct way to do this?
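One possible answer, offered as a sketch rather than a definitive one: AVAudioFormat's "standard" initializer produces the deinterleaved 32-bit float format that mixer nodes accept, which may serve as the canonical conversion target here:

```swift
import AVFoundation

// Sketch: the "standard" AVAudioFormat is deinterleaved Float32, the
// format AVAudioEngine mixer nodes work with natively.
let sampleRate = AVAudioSession.sharedInstance().sampleRate
let canonical = AVAudioFormat(standardFormatWithSampleRate: sampleRate,
                              channels: 2)!

// An AVAudioConverter could then go from the synthesizer's format to it:
// let converter = AVAudioConverter(from: synthesizerFormat, to: canonical)
```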
Post not yet marked as solved
I tried to run multiple demos utilising spatial audio. However, no matter what I do, I only get 2-channel output, which is also confirmed by calling:
let numHardwareOutputChannels = gameView.audioEngine.outputNode.outputFormat(forBus: 0).channelCount
My Apple TV is connected to a Dolby Atmos capable audio system which works just fine.
So my question is, more or less: how do I convince a tvOS app that my Apple TV has multichannel output?
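A sketch of one thing worth trying: asking the session for its maximum channel count before starting playback. Whether it takes effect depends on the route actually negotiated with the receiver:

```swift
import AVFoundation

// Sketch: request as many output channels as the current route reports.
func requestMultichannelOutput() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback, mode: .moviePlayback)
    try session.setActive(true)
    try session.setPreferredOutputNumberOfChannels(
        session.maximumOutputNumberOfChannels)
    print("channels now: \(session.outputNumberOfChannels)")
}
```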
Post not yet marked as solved
Working on a recording app. I started from scratch and basically jumped right into recording. I made sure to add the Privacy - Microphone Usage Description string.
What strikes me as odd is that the app launches straight into recording. No alert comes up the first time asking the user for permission, which I thought was the norm.
Have I misunderstood something?
override func viewDidLoad() {
super.viewDidLoad()
record3()
}
func record3() {
print ("recording")
let node = audioEngine.inputNode
let recordingFormat = node.inputFormat(forBus: 0)
var silencish = 0
var wordsish = 0
makeFile(format: recordingFormat)
node.installTap(onBus: 0, bufferSize: 8192, format: recordingFormat, block: {
[self]
(buffer, _) in
do {
try audioFile!.write(from: buffer);
x += 1;
if x > 300 {
print ("it's over sergio")
endThis()
}
} catch {return};})
audioEngine.prepare()
do {
try audioEngine.start()
} catch let error {
print ("oh catch \(error)")
}
}
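One way to rule the permission flow in or out is to request microphone access explicitly rather than relying on the implicit prompt; a sketch (note the Simulator may behave differently from a physical device here):

```swift
import AVFoundation

// Sketch: ask for record permission up front, and only start recording
// once it is granted. `begin` would wrap something like record3() above.
func startRecordingWithPermission(_ begin: @escaping () -> Void) {
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        DispatchQueue.main.async {
            if granted {
                begin()
            } else {
                print("microphone permission denied")
            }
        }
    }
}
```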
Post not yet marked as solved
I'm using an AVAudioConverter object to decode an OPUS stream for VoIP. The decoding itself works well; however, whenever the stream stalls (no more audio packets are available to decode because of network instability) this can be heard as crackling or an abrupt stop in the decoded audio. OPUS can mitigate this through packet-loss concealment, indicated by passing a null data pointer in the C library to
int opus_decode_float (OpusDecoder * st, const unsigned char * data, opus_int32 len, float * pcm, int frame_size, int decode_fec), see https://opus-codec.org/docs/opus_api-1.2/group__opus__decoder.html#ga9c554b8c0214e24733a299fe53bb3bd2.
However, with AVAudioConverter using Swift I'm constructing an AVAudioCompressedBuffer like so:
let compressedBuffer = AVAudioCompressedBuffer(
format: VoiceEncoder.Constants.networkFormat,
packetCapacity: 1,
maximumPacketSize: data.count
)
compressedBuffer.byteLength = UInt32(data.count)
compressedBuffer.packetCount = 1
compressedBuffer.packetDescriptions!
.pointee.mDataByteSize = UInt32(data.count)
data.copyBytes(
to: compressedBuffer.data
.assumingMemoryBound(to: UInt8.self),
count: data.count
)
where data: Data contains the raw OPUS frame to be decoded.
How can I specify data loss in this context and cause the AVAudioConverter to output PCM data whenever no more input data is available?
More context:
I'm specifying the audio format like this:
static let frameSize: UInt32 = 960
static let sampleRate: Float64 = 48000.0
static var networkFormatStreamDescription =
AudioStreamBasicDescription(
mSampleRate: sampleRate,
mFormatID: kAudioFormatOpus,
mFormatFlags: 0,
mBytesPerPacket: 0,
mFramesPerPacket: frameSize,
mBytesPerFrame: 0,
mChannelsPerFrame: 1,
mBitsPerChannel: 0,
mReserved: 0
)
static let networkFormat =
AVAudioFormat(
streamDescription:
&networkFormatStreamDescription
)!
I've tried 1) setting byteLength and packetCount to zero, and 2) returning nil while setting .haveData in the AVAudioConverterInputBlock I'm using, both with no success.
Post not yet marked as solved
I am working on a music application where multiple wav files are scheduled within a time frame. Everything works perfectly except one scenario, where a small beep occurs while scheduling the player node again.
For example: one.wav is playing on PlayerNode1, and when I reschedule to second.wav after 2 seconds there is a small beep. I have tried stopping the node after checking the isPlaying condition, but it still doesn't work. Am I doing anything wrong here?
if playerNode.isPlaying {
playerNode.stop()
}
playerNode.scheduleFile(audioFile, at: nil, completionHandler: nil)
playerNode.play()
I am using the same player node for performance reasons, as there are 24 wav files that need to be played in 1 minute, so there is no point in keeping a separate player node for each.
How can I stop the beep when rescheduling a new audio file on the same player node?
I have shared link below to check issue.
https://drive.google.com/file/d/1FjZtLUj_wUp0LQPyjIwfJNy67HWUlt0I/view?usp=sharing
The expected result should be continuous playback of the song.
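A sketch of one common mitigation, assuming the beep comes from cutting the waveform mid-cycle when stop() is called: mute the node before stopping, then restore the volume shortly after playback resumes. The 10 ms delay is an arbitrary illustration value:

```swift
import AVFoundation

// Sketch: avoid the hard discontinuity when interrupting a playing node.
func reschedule(_ file: AVAudioFile, on node: AVAudioPlayerNode) {
    node.volume = 0                       // mute before the cut
    if node.isPlaying { node.stop() }
    node.scheduleFile(file, at: nil, completionHandler: nil)
    node.play()
    // Restore volume once the new file has started (10 ms is arbitrary):
    DispatchQueue.main.asyncAfter(deadline: .now() + 0.01) {
        node.volume = 1
    }
}
```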
Post not yet marked as solved
Is there any possible way to produce a system for my macOS program that will:
-Allow the user to pick whether my program will output its audio either to the system output or to an AirPlay destination?
-While doing so, offer the ability to control the volume of the asset currently playing? (AVPlayer 'volume' setter stops responding when connected to an AirPlay endpoint)
-Also allow me to attach a 10-band EQ to the output?
I tried to do this five years ago, in 2017, and expected the ecosystem to have improved by now. While AVRoutePickerView and AVPlayer are user-friendly and convenient, the fact that basic functionality like the volume control ceases functioning over AirPlay is quite frustrating. AVAudioPlayer seems like it might offer this functionality, but only on iOS and not on macOS!
I am basically only trying to offer the same AirPlay controls that Music.app does. Is this really so difficult?
Post not yet marked as solved
I have my Swift app that records audio in chunks of multiple files, each M4A file is approx 1 minute long. I would like to go through those files and detect silence, or the lowest level.
While I am able to read the file into a buffer, my problem is deciphering it. Even with Google, all that comes up is "audio players" instead of sites that describe the header and the data.
Where can I find what to look for? Or should I be converting it to a WAV file first? But even then I cannot seem to find a tool, or a site, that tells me how to decipher what I am reading.
Obviously it exists, since Siri knows when you've stopped speaking. Just trying to find the key.
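Rather than parsing the M4A container by hand, one sketch is to let AVAudioFile decode it to float PCM and measure RMS level per window; the 100 ms window and the -50 dB threshold are arbitrary starting points, not calibrated values:

```swift
import AVFoundation

// Sketch: decode the file and flag windows whose RMS falls below a
// threshold as "quiet". Only channel 0 is examined.
func findQuietWindows(url: URL) throws -> [Bool] {
    let file = try AVAudioFile(forReading: url)
    let format = file.processingFormat            // decoded float PCM
    let frames = AVAudioFrameCount(file.length)
    let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frames)!
    try file.read(into: buffer)

    let samples = UnsafeBufferPointer(start: buffer.floatChannelData![0],
                                      count: Int(buffer.frameLength))
    let window = Int(format.sampleRate / 10)      // 100 ms windows
    return stride(from: 0, to: samples.count - window, by: window).map { start in
        let slice = samples[start..<start + window]
        let rms = (slice.map { $0 * $0 }.reduce(0, +) / Float(window)).squareRoot()
        let db = 20 * log10(Double(max(rms, 1e-12)))
        return db < -50                           // "quiet" window
    }
}
```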
Post not yet marked as solved
Hello,
I am trying to use AVAudioFile to save audio buffer to .wav file. The buffer is of type [Float].
Currently I am able to successfully create the .wav files and even play them, but they are blank - I cannot hear any sound.
private func saveAudioFile(using buffer: [Float]) {
let fileUrl = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!.appendingPathComponent("\(UUID().uuidString).wav")
let fileSettings = [
AVFormatIDKey: Int(kAudioFormatLinearPCM),
AVSampleRateKey: 15600,
AVNumberOfChannelsKey: 1,
AVEncoderAudioQualityKey: AVAudioQuality.high.rawValue
]
guard let file = try? AVAudioFile(forWriting: fileUrl, settings: fileSettings, commonFormat: .pcmFormatInt16, interleaved: true) else {
print("Cannot create AudioFile")
return
}
guard let bufferFormat = AVAudioFormat(settings: fileSettings) else {
print("Cannot create buffer format")
return
}
guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat, frameCapacity: AVAudioFrameCount(buffer.count)) else {
print("Cannot create output buffer")
return
}
for i in 0..<buffer.count {
outputBuffer.int16ChannelData!.pointee[i] = Int16(buffer[i])
}
outputBuffer.frameLength = AVAudioFrameCount(buffer.count)
do {
try file.write(from: outputBuffer)
} catch {
print(error.localizedDescription)
print("Write to file failed")
}
}
Where should I be looking first for the problem? Is it a format issue?
I am getting the data from the microphone with the AVAudioEngine.
Its format is created like this:
let outputFormat = AVAudioFormat(commonFormat: .pcmFormatInt16, sampleRate: Double(15600), channels: 1, interleaved: true)!
And here is the installTap implementation with the buffer callback:
input.installTap(onBus: 0, bufferSize: AVAudioFrameCount(sampleRate*2), format: inputFormat) { (incomingBuffer, time) in
DispatchQueue.global(qos: .background).async {
let pcmBuffer = AVAudioPCMBuffer(pcmFormat: outputFormat, frameCapacity: AVAudioFrameCount(outputFormat.sampleRate * 2.0))
var error: NSError? = nil
let inputBlock: AVAudioConverterInputBlock = { inNumPackets, outStatus in
outStatus.pointee = AVAudioConverterInputStatus.haveData
return incomingBuffer
}
formatConverter.convert(to: pcmBuffer!, error: &error, withInputFrom: inputBlock)
if error != nil {
print(error!.localizedDescription)
}
else if let pcmBuffer = pcmBuffer, let channelData = pcmBuffer.int16ChannelData {
let channelDataPointer = channelData.pointee
self.buffer = stride(from: 0, to: self.windowLengthSamples, by: 1).map { Float(channelDataPointer[$0]) / 32768.0 }
onBufferUpdated(self.buffer)
}
}
}
The onBufferUpdated is the block that provides [Float] for the saveAudioFile method above.
I have tried some experiments with different output formats, but those ended up producing unplayable audio files.
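One thing worth checking, offered as a guess: since the tap normalizes samples by dividing by 32768.0, the [Float] values lie roughly in -1...1, and Int16(buffer[i]) truncates almost all of them to 0, which would sound like silence. A hypothetical rescaling sketch:

```swift
// Sketch: scale normalized floats (-1...1) back to the Int16 range
// before writing, instead of truncating them with Int16(buffer[i]).
func toInt16Samples(_ normalized: [Float]) -> [Int16] {
    normalized.map { sample in
        Int16(max(-1.0, min(1.0, sample)) * 32767.0)
    }
}
```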
Post not yet marked as solved
Hello,
I would like to create a VoIP app, and I want to build a sound visualizer for the voice of the other party.
example:
https://medium.com/swlh/swiftui-create-a-sound-visualizer-cadee0b6ad37
The call function was implemented using CallKit (based on Twilio's quickstart):
https://jp.twilio.com/docs/voice/sdks/ios/get-started
https://github.com/twilio/voice-quickstart-ios
After importing AVFoundation, I need to route the audio through the AVAudioEngine mixer.
I have no idea how to feed the voice into the mixer.
Any guidance is appreciated!
Thank you!
Post not yet marked as solved
Hello,
I am starting to work with and learn AVAudioEngine.
Currently I am at the point where I would like to read an audio file of a speech and determine whether there are any moments of silence in it.
Does this framework provide any properties, such as power level, decibels, etc., that I can use to find sufficiently long moments of silence?
I am trying to save the buffer from my installTap to a file. I do it in chunks of 10 so I'll get a bigger file. When I try to play the written file (from the simulator's directory), QuickTime says that it's not compatible.
I have examined the bad m4a file and a working one. There are a lot of zeros at the beginning of the bad file, followed by a lot of data. However, both files appear to have the same header.
A lot of people mention that I have to nil the AVAudioFile, but:
audioFile = nil
is not valid syntax here, nor can I find a close method on AVAudioFile.
Here's the complete code, edited into one working file:
import UIKit
import AVFoundation
class ViewController: UIViewController {
let audioEngine = AVAudioEngine()
var audioFile = AVAudioFile()
var x = 0
override func viewDidLoad() {
super.viewDidLoad()
record()
// Do any additional setup after loading the view.
}
func makeFile(format: AVAudioFormat) {
let paths = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first
do {
_ = try FileManager.default.contentsOfDirectory(at: paths!, includingPropertiesForKeys: nil)
} catch { print ("error")}
let destinationPath = paths!.appendingPathComponent("audioT.m4a")
print ("\(destinationPath)")
do {
audioFile = try AVAudioFile(forWriting: destinationPath,
settings: format.settings)
print ("file created")
} catch { print ("error creating file")}
}
func record(){
let node = audioEngine.inputNode
let recordingFormat = node.inputFormat(forBus: 0)
makeFile(format: recordingFormat)
node.installTap(onBus: 0, bufferSize: 8192, format: recordingFormat, block: { [self]
(buffer, _) in
do {
try audioFile.write(from: buffer);
print ("buffer filled");
x += 1;
print("wrote \(x)")
if x > 9 {
endThis()
}
} catch {return};})
audioEngine.prepare()
do {
try audioEngine.start()
} catch let error {
print ("oh catch")
}
}
func endThis(){
audioEngine.stop()
audioEngine.inputNode.removeTap(onBus: 0)
}
}
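For what it's worth, a sketch of the "nil it out" approach: AVAudioFile exposes no explicit close, and the file appears to be finalized when the object is deallocated, so declaring the property as an optional makes the advice compile (Recorder is a hypothetical container):

```swift
import AVFoundation

// Sketch: an optional property can be set to nil, which releases the
// AVAudioFile and lets it finalize the file header on deallocation.
class Recorder {
    var audioFile: AVAudioFile?   // instead of `var audioFile = AVAudioFile()`
    let audioEngine = AVAudioEngine()

    func endThis() {
        audioEngine.stop()
        audioEngine.inputNode.removeTap(onBus: 0)
        audioFile = nil           // release, so the m4a gets finalized
    }
}
```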
Post not yet marked as solved
I am trying to go from installTap straight to AVAudioFile(forWriting:).
I call:
let recordingFormat = node.outputFormat(forBus: 0)
and I get back :
<AVAudioFormat 0x60000278f750: 1 ch, 48000 Hz, Float32>
But AVAudioFile has a settings parameter of type [String : Any], and I am curious how to fill in those values to record in the required format.
Hopefully these are the values I need?
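They may well be: AVAudioFormat already exposes a matching [String : Any] via its settings property, so the tap format can be handed to AVAudioFile directly. A sketch, with `destinationURL` as a hypothetical file location:

```swift
import AVFoundation

// Sketch: reuse the tap format's settings dictionary for the output file.
func makeFile(engine: AVAudioEngine, destinationURL: URL) throws -> AVAudioFile {
    let recordingFormat = engine.inputNode.outputFormat(forBus: 0)
    // recordingFormat.settings contains sample rate, channel count,
    // format ID, etc. derived from <1 ch, 48000 Hz, Float32>.
    return try AVAudioFile(forWriting: destinationURL,
                           settings: recordingFormat.settings)
}
```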
I am expanding a speech-to-text demo, and while it works, I am still trying to learn Swift. Is .installTap the Swift version of a C callback function?
From what I interpret here, every time the buffer becomes full, the code between the braces runs, and the code below it also runs.
It almost feels like a callback combined with a GOTO line from BASIC.
Yes, it works, but I'd like to confirm that I am following the flow of the code correctly.
func startSpeechRecognition (){
let node = audioEngine.inputNode
let recordingFormat = node.outputFormat(forBus: 0)
node.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, _) in self.request.append(buffer) }
audioEngine.prepare()
do {
try audioEngine.start()
} catch let error {
...
}
guard let myRecognition = SFSpeechRecognizer() else {
...
return
}
if !myRecognition.isAvailable {
...
}
task = speechRecognizer?.recognitionTask(with: request, resultHandler: { (response, error) in guard let response = response else {
if error != nil {
print ("\(String(describing: error.debugDescription))")
} else {
print ("problem in response")
}
return
}
let message = response.bestTranscription.formattedString
print ("\(message)")
})
}