Failed to start VoiceProcessingIO AudioUnit on Vision Pro (visionOS 1.1.1)

Hello,

We are trying to implement audio calling on visionOS, but it has stopped working since the visionOS 1.1.1 update. We do not use CallKit for this flow.

We configure the audio session as follows:

[sessionInstance setCategory:AVAudioSessionCategoryPlayAndRecord mode:AVAudioSessionModeVoiceChat options: (AVAudioSessionCategoryOptionAllowBluetooth | AVAudioSessionCategoryOptionAllowBluetoothA2DP | AVAudioSessionCategoryOptionMixWithOthers) error:&error_];

We create our audio unit as follows:

AudioComponentDescription desc_;
        desc_.componentType         = kAudioUnitType_Output;
        desc_.componentSubType      = kAudioUnitSubType_VoiceProcessingIO;
        desc_.componentManufacturer = kAudioUnitManufacturer_Apple;
        desc_.componentFlags        = 0;
        desc_.componentFlagsMask    = 0;
        AudioComponent comp_ = AudioComponentFindNext(NULL, &desc_);
        
        IMSXThrowIfError(AudioComponentInstanceNew(comp_, &_audioUnit),"couldn't create a new instance of Apple Voice Processing IO.");

        UInt32 one_ = 1;
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Input, audioUnitElementIOInput, &one_, sizeof(one_)), "could not enable input on Apple Voice Processing IO");
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output, audioUnitElementIOOutput, &one_, sizeof(one_)), "could not enable output on Apple Voice Processing IO");

        IMSTagLogInfo(kIMSTagAudio, @"Rate: %ld", _rate);
        bool isInterleaved = _channel == 2 ? true : false;
        self.ioFormat = CAStreamBasicDescription(_rate, _channel, CAStreamBasicDescription::kPCMFormatInt16, isInterleaved);
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 0, &_ioFormat, sizeof(self.ioFormat)), "couldn't set the input client format on Apple Voice Processing IO");
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Output, 1, &_ioFormat, sizeof(self.ioFormat)), "couldn't set the output client format on Apple Voice Processing IO");
        
        UInt32 maxFramesPerSlice_ = 4096;
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioUnitProperty_MaximumFramesPerSlice, kAudioUnitScope_Global, 0, &maxFramesPerSlice_, sizeof(UInt32)), "couldn't set max frames per slice on Apple Voice Processing IO");
        UInt32 propSize_ = sizeof(UInt32);
        IMSXThrowIfError(AudioUnitGetProperty(self.audioUnit, kAudioUnitProperty_MaximumFramesPerSlice, kAudioUnitScope_Global, 0, &maxFramesPerSlice_, &propSize_), "couldn't get max frames per slice on Apple Voice Processing IO");
        
        AURenderCallbackStruct renderCallbackStruct_;
        renderCallbackStruct_.inputProc       = playbackCallback;
        renderCallbackStruct_.inputProcRefCon = (__bridge void *)self;
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioUnitProperty_SetRenderCallback, kAudioUnitScope_Output, 0, &renderCallbackStruct_, sizeof(renderCallbackStruct_)), "couldn't set render callback on Apple Voice Processing IO");
        
        AURenderCallbackStruct inputCallbackStruct_;
        inputCallbackStruct_.inputProc       = recordingCallback;
        inputCallbackStruct_.inputProcRefCon = (__bridge void *)self;
        IMSXThrowIfError(AudioUnitSetProperty(self.audioUnit, kAudioOutputUnitProperty_SetInputCallback, kAudioUnitScope_Input, 0, &inputCallbackStruct_, sizeof(inputCallbackStruct_)), "couldn't set input callback on Apple Voice Processing IO");

As soon as we try to start the AudioUnit, we get the following error:

PhaseIOImpl.mm:1514 phaseextio@0x107a54320: failed to start IO directions 0x3, num IO streams [1, 1]: Error Domain=com.apple.coreaudio.phase Code=1346924646 "failed to pause/resume stream 6B273F5B-D6EF-41B3-8460-0E34B00D10A6" UserInfo={NSLocalizedDescription=failed to pause/resume stream 6B273F5B-D6EF-41B3-8460-0E34B00D10A6}

We do not use the PHASE framework ourselves, and the error is neither clear to us nor documented anywhere.
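
Core Audio status codes are often four ASCII characters packed into a 32-bit integer, so the numeric code from the log can be decoded to see whether it matches a known constant. The helper below is only a sketch: FourCCString is a hypothetical name of our own, not an Apple API, and it assumes the PHASE error code follows the four-character convention. It is plain C++, so it also compiles in an Objective-C++ file.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not an Apple API): render a 32-bit status code as
// four ASCII characters, most significant byte first. If any byte is not
// printable ASCII, fall back to the decimal representation of the code.
static std::string FourCCString(int32_t status) {
    const uint32_t value = static_cast<uint32_t>(status);
    std::string text(4, ' ');
    for (int i = 0; i < 4; ++i) {
        const unsigned char byte =
            static_cast<unsigned char>((value >> (8 * (3 - i))) & 0xFF);
        if (byte < 0x20 || byte > 0x7E) {
            return std::to_string(status);
        }
        text[i] = static_cast<char>(byte);
    }
    return text;
}
```

For the code in the log above, FourCCString(1346924646) returns "PHpf". The prefix suggests the code originates in PHASE, but that is a guess on our part.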

We also tried an AudioUnit that only handles playback (speaker), which works perfectly. But as soon as we try to record from an AudioUnit, the start fails as well, with the error AVAudioSessionErrorCodeCannotStartRecording.

We suspect that an I/O VoIP audio unit is somehow already running inside PHASE, and that we are unable to stop or kill it when we create our own, which blocks the whole flow.

It used to work on visionOS 1.0.1.

Regards, Summit-tech