CallKit speaker problem

I’m developing a VoIP app that uses Linphone and CallKit. Everything works as expected until the user enables the speaker on the native CallKit screen. After that, all subsequent calls start with the speaker already on. Even if I call AVAudioSession.sharedInstance().overrideOutputAudioPort(.none), it gets overridden when the call starts (when Linphone begins playing the ringtone). I tested this behavior in WhatsApp, and it seems to work correctly there.

I tested this behavior in WhatsApp, and it seems to work correctly there.

Does it work correctly in Speakerbox? Speakerbox is effectively our "reference implementation" for CallKit, so that's the first question I'll nearly always ask. Critically, if it DOESN'T happen in Speakerbox, then the issue is being caused by "something" your app is doing.

Continuing with the assumption that Speakerbox works correctly...

I’m developing a VoIP app that uses Linphone and CallKit. Everything works as expected until the user enables the speaker on the native CallKit screen. After that, all subsequent calls start with the speaker already on. Even if I call AVAudioSession.sharedInstance().overrideOutputAudioPort(.none),

I can't really tell you exactly what the problem might be. However, these kinds of audio issues are generally caused by the complicated interactions between a number of different factors:

  1. The audio system has a rigid but poorly documented set of "rules" about exactly what changes you can make to an active audio session.

  2. The audio session a CallKit call uses is NOT one of the standard audio session categories and cannot be activated through the standard audio API. Note that you can actually hear the difference, as the max volume of a CallKit call is noticeably louder than playAndRecord.

  3. All CallKit apps have the "audio" background category, which means that the audio system will allow activation that would otherwise be blocked.

  4. CallKit operates in terms of managing 1 or more "calls", but the audio system itself operates at the process level, which means the two layers don't always line up "cleanly". For example, if your app has two calls active at the same time, "switching between calls" doesn't really change anything as far as the audio system is concerned.

The critical guidance here is basically "do what Speakerbox does". Namely:

  • Do all session configuration before reporting the call and don't change that configuration after that point.

  • Do NOT directly activate the audio session, let CallKit handle that.

Most issues here are caused by violating that last rule, as the interactions between 2 & 3 above mean that calling setActive often "works" (meaning, the audio session activates and you're able to play and record), but ends up leaving the audio system in a misconfigured/unexpected state, which then causes other failures "somewhere else".
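
As a rough illustration of the two rules above, here is a minimal sketch of the pattern, not a copy of Speakerbox. The CallAudioCoordinator class, the startAudio/stopAudio helpers, and the choice of the .voiceChat mode are assumptions made for the example. The session is configured before the call is reported, setActive is never called, and audio I/O only starts once CallKit reports that it has activated the session.

```swift
import AVFoundation
import CallKit

// Hypothetical wrapper class; the name and structure are illustrative only.
final class CallAudioCoordinator: NSObject, CXProviderDelegate {
    private let provider: CXProvider

    init(provider: CXProvider) {
        self.provider = provider
        super.init()
        provider.setDelegate(self, queue: nil) // nil delivers callbacks on the main queue
    }

    // Configure the session (without activating it), then report the call.
    func reportIncomingCall(uuid: UUID, handle: String) {
        do {
            // Assumption: playAndRecord with the voiceChat mode suits this app.
            try AVAudioSession.sharedInstance().setCategory(.playAndRecord,
                                                            mode: .voiceChat,
                                                            options: [])
        } catch {
            print("Audio session configuration failed: \(error)")
        }
        // Deliberately no setActive(true) here: CallKit activates the session.

        let update = CXCallUpdate()
        update.remoteHandle = CXHandle(type: .generic, value: handle)
        provider.reportNewIncomingCall(with: uuid, update: update) { error in
            if let error = error {
                print("Failed to report incoming call: \(error)")
            }
        }
    }

    // MARK: CXProviderDelegate

    func providerDidReset(_ provider: CXProvider) {
        stopAudio()
    }

    // CallKit has activated the session; only now start audio I/O.
    func provider(_ provider: CXProvider, didActivate audioSession: AVAudioSession) {
        startAudio()
    }

    // CallKit has deactivated the session; stop audio I/O.
    func provider(_ provider: CXProvider, didDeactivate audioSession: AVAudioSession) {
        stopAudio()
    }

    // Hypothetical hooks: an app would start/stop its VoIP engine's audio here.
    private func startAudio() {}
    private func stopAudio() {}
}
```

With a third-party engine such as Linphone, the equivalent would presumably be to keep the engine from activating or reconfiguring the session itself and to drive it from these two delegate callbacks instead; how that is done depends on the engine and is not shown here.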

Finally, a concern here:

it gets overridden when the call starts (when Linphone begins playing the ringtone).

The words "begins playing the ringtone" make me very nervous. CallKit expects the ringtone to be set by providing a file name to CXProviderConfiguration.ringtoneSound so that it can play that file* outside of your process (instead of having your app do its own playback). Apps have tried using #3 above to do their own ringtone playback "directly", but doing so is inherently dangerous and tricky, as it tends to run into problems with #2 and #3. My general advice is "don't do this".

*As a side note, I believe we actually support the same locations UNNotificationSound does, so this API is much more configurable than it might appear.
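
For reference, a minimal sketch of handing the ringtone to CallKit rather than playing it in-process. The file name "Ringtone.caf" is hypothetical and is assumed to be a sound file shipped in the app's main bundle; the other configuration values are placeholders, and the no-argument initializer assumes iOS 14 or later.

```swift
import CallKit

// Builds a provider configuration whose ringtone is played by the system,
// outside of the app's process.
func makeProviderConfiguration() -> CXProviderConfiguration {
    let configuration = CXProviderConfiguration() // iOS 14+ initializer
    configuration.supportsVideo = false
    configuration.maximumCallsPerCallGroup = 1
    configuration.supportedHandleTypes = [.generic]
    // Hypothetical file name; assumed to exist in the app bundle.
    configuration.ringtoneSound = "Ringtone.caf"
    return configuration
}

let provider = CXProvider(configuration: makeProviderConfiguration())
```

With this in place, the app (or its VoIP library) would not play any incoming-call ringtone of its own.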

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
