Push To Talk framework doesn't active audio session in background

We are trying to extend our app with Push To Talk functionality by integrating the Push To Talk framework. We are extensively testing what happens if the app is running in the foreground, in the background or not running at all.

When the app is in the foreground, and the user has joined a channel we maintain an open connection to our server. When a remote participant starts streaming audio, we immediately call setActiveRemoteParticipant on our PTChannelManager instance. The PTT system will than call our delegate's channelManager:didActivate audioSession method and we can successfully play the incoming audio.

When the app is not running at all, there is of course no active connection initially. When another participant starts talking we send a push notification. The PTT system will start our app in the background, call the incomingPushResult method on our delegate, after returning the remote participant the PTT framework will then call the channelmanager:didJoin delegate method which we will use to re-establish the server connection, the PTT framework then calls our channelManager:didActivate audioSession delegate method and we can then successfully play audio.

Now the problem. When the application was initially in the foreground and has an established server connection, we initially keep the server connection active when the app enters the background state, until a certain timeout or the system decides our app needs to be killed / removed from memory. This allows us to finish an incoming audio stream, quickly react on incoming responses etc. When we then receive an incoming audio stream after a certain delay (for example 5 seconds) we call the channelManager.setRemoteParticipant method (using try await syntax). This finishes successfully, without any error, however the channelManager:didActivate audioSession delegate method is never called. Manually setting up an audio session is not allowed either and returns an error.

Our current workaround for this issue is to disconnect the server connection as soon as the app goes into the background. This will make sure our server sends a push notification, which is successful in activating the audio session after which we can play audio. However, this means we need to re-establish the connection which will introduce an unnecessary delay before we can start playback (and currently means we loose some audio). This also means we need to do extra checks when going to the background to make sure there is no active incoming stream. After each incoming stream we have to check again if we are in the background and disconnect immediately to make sure we get a push notification next time. This can of course also lead to race conditions in an active conversation where we might need to disconnect between incoming streams and if we don't do this in time we might never get an activated audio session.

Now this might be by design, as Apple might not want us to keep the server connection active when the application enters the background state. But if that's the case I would expect the channelManager.setRemoteParticipant method to throw an error, but it doesn't. It returns successfully after which we would expect the audio session to get activated as well. So maybe we are not setting the capabilities of our project correctly (we might need other background permissions as well, although we already experimented with that), or we need to do something else to make this work?

I've created an example app to demonstrate my problem. It can be found here: https://github.com/egeniq/ptt-audio-activation-test-ios

To reproduce, do the following:

  • Change team selection in Xcode.
  • Run the app.
  • Choose join.
  • Tap on the "Set remote participant NOW" button.
  • See in the log output that the audio session gets activated.
  • Tap on the "Clear remote participant" button.
  • See in the log output that the audio session gets deactivated.
  • Tap on the "Set remote participant after 5s" button.
  • Immediately go back to your phone's homescreen.
  • After 5s see in the log output that the remote participant is successfully set.
  • However also note that no audio session activation occurs.
Push To Talk framework doesn't active audio session in background
 
 
Q