I am developing an App that will enable voice calls between users through webrtc. When the user opens the App and switches the App to the background, the user will receive the incoming call notification through Silent Push Notifications (not PushKit). My question is as follows,
As a word of warning, Silent Push is NOT a reliable push delivery mechanism. Indeed, a voip push is actually just a Silent Push with the delivery behavior of a high-priority alert push. I'm not sure why you're avoiding PushKit, but the article "Sending End-to-End Encrypted VoIP Calls" describes our recommended architecture for apps that cannot use PushKit.
In addition, if you also want to avoid CallKit, then there are two architectures:
-
The "legacy" option is to use standard user-visible pushes for call notifications. The one note there is that UNNotificationSound's defaultRingtoneSound and ringtoneSoundNamed(_:) will give you a sound playback experience that's closer to the CallKit behavior.
-
On iOS 17.4+, LiveCommunicationKit is an alternative to CallKit which provides similar audio capabilities (for example, it interoperates with CallKit calls) without the same "phone UI" that CallKit implements. Note that it is our preferred solution in situations where legal issues prevent CallKit from being used.
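For illustration, the first ("legacy") option above can be sketched as a local notification request; the caller name and ringtone file name here are placeholders, not required values:

```swift
import UserNotifications

// Sketch: posting a user-visible incoming-call notification with a
// ringtone-style sound (the "legacy" option). `callerName` and the
// custom sound file name are placeholders for your own data.
func postIncomingCallNotification(callerName: String) {
    let content = UNMutableNotificationContent()
    content.title = "Incoming call"
    content.body = callerName
    // defaultRingtoneSound (iOS 15.2+) gives playback behavior closer
    // to CallKit's ringing experience than a standard alert sound.
    content.sound = UNNotificationSound.defaultRingtoneSound
    // Or a ringtone bundled with your app:
    // content.sound = UNNotificationSound.ringtoneSoundNamed(
    //     UNNotificationSoundName("MyRingtone.caf"))

    let request = UNNotificationRequest(
        identifier: UUID().uuidString,
        content: content,
        trigger: nil) // deliver immediately
    UNUserNotificationCenter.current().add(request)
}
```

In a push-driven app the same content would come from the notification payload rather than a local request, but the sound configuration is the relevant part.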
Finally, if you're avoiding CallKit because its "call" focus doesn't match your app's needs, then you may want to look at the PushToTalk framework and consider designing around its requirements.
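For reference, joining a channel with PushToTalk might look like the sketch below (iOS 16+); the channel name and UUID handling are placeholders:

```swift
import PushToTalk

// Sketch: asking the system to join a push-to-talk channel.
// A real app would persist the channel UUID and implement
// PTChannelManagerDelegate / PTChannelRestorationDelegate on the
// object passed when creating the PTChannelManager.
func joinTeamChannel(using manager: PTChannelManager) {
    let descriptor = PTChannelDescriptor(name: "Team channel", image: nil)
    manager.requestJoinChannel(channelUUID: UUID(), descriptor: descriptor)
}
```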
If I set UIBackgroundModes to voip and do not use PushKit and CallKit, will this cause the background App to be unable to use webrtc voice calls (which require network, microphone, and audio permissions)?
No. Strictly speaking, the "voip" and "audio" background categories actually perform different functions in the voip lifecycle. More specifically, in our original architecture:
-
"voip" is how apps are notified of incoming calls. It also provides access to voip specific APIs like CallKit, LiveCommunicationKit, and voip pushes through PushKit.
-
"audio" is how apps are kept awake once a call is actually active.
CallKit has blurred the line between those two roles, since it requires the "voip" mode and is also entangled with the audio session; however, the general division still exists.
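In Info.plist terms, the division above looks like this (a sketch; include only the modes your architecture actually uses):

```xml
<key>UIBackgroundModes</key>
<array>
    <!-- incoming call notification; unlocks CallKit, LiveCommunicationKit,
         and voip pushes through PushKit -->
    <string>voip</string>
    <!-- keeps the app awake while a call is actually active -->
    <string>audio</string>
</array>
```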
That leads to your next question:
Can I set UIBackgroundModes = audio combined with AVAudioSession playAndRecord instead of setting UIBackgroundModes to voip, so that I can use the microphone and audio in the background to implement webrtc voice calls?
Yes, though that approach comes with some significant limitations:
-
The audio system will not allow any kind of recording session to activate from the background, so your app will need to come to the foreground to be able to actually start a call.
-
The voip audio session managed by our voip APIs is NOT identical to a standard playAndRecord session. The most serious issue here is that it has a higher interruption priority than any other audio session type, so ANY incoming call will immediately interrupt your playAndRecord session. Since that active audio session is how your app is staying awake, the practical effect is that any incoming call will immediately terminate your call. Note that this is the main problem that led to the creation of CallKit.
Other side effects of using the standard playAndRecord session include:
-
The maximum volume of a standard playAndRecord session is noticeably lower than the voip audio session.
-
Your playAndRecord audio session follows the standard rules of audio session priority so, for example, setting your session as non-mixable also means that your app will be interrupted by other non-mixable clients (like Music.app). Using mixable can avoid that... but that now means your call is mixing with the other playback apps.
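For reference, a minimal playAndRecord configuration of the kind discussed above might look like the sketch below; the mode and options shown are one reasonable choice, not requirements:

```swift
import AVFoundation

// Sketch: configuring a standard playAndRecord session for an
// "audio"-background call. This must run while the app is in the
// foreground, since recording cannot be started from the background.
func configureCallAudioSession() throws {
    let session = AVAudioSession.sharedInstance()
    // .voiceChat tunes signal processing for two-way voice.
    // Adding .mixWithOthers to the options would avoid interruption
    // by other non-mixable clients, at the cost of mixing your call
    // with their playback (the trade-off described above).
    try session.setCategory(.playAndRecord,
                            mode: .voiceChat,
                            options: [.allowBluetooth])
    try session.setActive(true)
}
```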
The summary of all this is that while it's technically possible to design a calling app entirely around "audio", doing so requires accepting very significant limitations which cannot really be worked around or avoided.
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware