We’re evaluating a Bluetooth device that supports Hands Free Profile (HFP) as the “Hands-Free Unit”. You can think of this as a Bluetooth telephone headset. This device interacts with our iOS application.
We observed the following. The iPhone 17 HFP Wide-Band Speech (WBS) mSBC decoder requires the WBS packet (H2 header + mSBC frame) to be sent aligned. Aligned meaning, the H2 header must be first in every packet. The WBS packet cannot span multiple eSCO packets or else the iPhone will discard the audio. This is a different implementation than the iPad (iPad Pro 11-inch M4) , presumably due to Apple’s new N1 chip. In other words, we’ve identified that older iPhones and iPads do not require this alignment. They have implemented a stateful parser/decoder of the HFP WBS audio.
A quick picture to help illustrate. The iPhone 17 requires:
| Frame | Frame | Frame | Frame |
However, a Bluetooth implementation we are evaluating does:
| me Fra | me Fra | me Fra | me Fra |
Does Apple intend to keep this implementation and continue discarding audio frames that are not aligned?
Page 115 of the Bluetooth HFP 1.8 specification mentions at the bottom that this behavior is “left up to the implementation” but that the “synchronization header enables unaligned codec audio frames to be recovered by the receiving side.” We understand and acknowledge that one whole frame per eSCO packet is the intended, optimal method for delivering WBS mSBC audio for reduced jitter, latency, and memory usage. However, the more robust solution would be to maintain a stateful receiver as previously implemented.
Any input would be appreciated.