Hello everyone,
Our team is currently developing an iOS application requiring real-time audio communication and evaluating the most suitable frameworks. Options include CallKit, custom solutions using AVAudioEngine/Audio Units, and the PushToTalk framework.
Regarding the PushToTalk framework, we have some questions about its core design and capabilities that we'd appreciate clarification on from the community or Apple engineers.
Based on the PushToTalk framework documentation, its API design (e.g., methods like requestBeginTransmission, endTransmission which imply explicitly requesting transmission rights), and its system UI integration, it strongly appears oriented towards half-duplex communication scenarios, similar to traditional walkie-talkies where only one participant transmits audio at a time.
Is this understanding accurate? Is the PushToTalk framework's design strictly limited to managing half-duplex audio interactions? Or, does the framework itself also provide built-in mechanisms or APIs to manage simultaneous, bi-directional (full-duplex) audio streaming between participants?
To be clear, we are asking about the inherent capabilities of the PushToTalk framework itself. We understand it's possible to use PushToTalk for signaling and UI management, and separately implement the actual full-duplex audio stream using AVAudioEngine or other audio APIs. However, we want to confirm if the framework itself is designed to support or simplify full-duplex audio communication.
Have other developers investigated the specific limitations or capabilities of the PushToTalk framework regarding audio transmission modes (half-duplex vs. full-duplex)?
Are there any official documentation references or WWDC sessions that explicitly clarify the framework's support (or lack thereof) for full-duplex operation?
If PushToTalk is indeed limited to half-duplex, what are the generally accepted best practices for apps requiring full-duplex calls – transitioning directly to CallKit (where applicable) or building custom audio processing pipelines?
Clarifying this point is crucial for us to select the correct technology stack for our application. Any relevant insights, documentation pointers, or shared development experiences would be greatly appreciated.
Thank you for your help!
The PushToTalk framework supports three transmission modes:
- Full Duplex
- Half Duplex (the default)
- Listen Only
You can set the transmission mode using the PTChannelManager method: setTransmissionMode(_:channelUUID:completionHandler:)
If you set the transmission mode to full duplex, your app can receive incoming transmissions and broadcast outgoing transmissions simultaneously. However, if you are planning to implement something similar to a traditional phone call experience with one or more participants, CallKit may be a better fit for your use case.