Issue related to APNS delivering expired VoIP push notifications.

Hi, I am facing an issue where VoIP push notifications are delivered 1-2 hours late, even though apns-expiration is set to 0 and apns-priority is set to 10.

I had raised a similar post and got a reply that it may be due to network delay. But network delay can only delay the delivery of a VoIP push by a few seconds or minutes. In our case, the VoIP push is delivered hours after the VoIP call was attempted.

Steps to reproduce:

  1. Put our VoIP app in the background and lock the iPhone. As the app is put in the background, its socket connections get disconnected from the server. Now if a caller makes a call to this app, the call should be delivered through VoIP push.

  2. The VoIP push should ideally be received even if the app is in the background and the iPhone is locked. The iPhone is connected to a good WiFi network, but it does not receive the VoIP push.

  3. After 1-2 hours the user unlocks the iPhone and opens the VoIP app. As soon as the user opens the app, the VoIP push is received and the phone starts ringing.

Answered by DTS Engineer in 831939022

So there are actually two different issues at play here. The broad issue here:

Hi, I am facing an issue where VoIP push notifications are delivered 1-2 hours late, even though apns-expiration is set to 0 and apns-priority is set to 10.

...is a long-standing issue with APNS. Basically, if a push is prepped for delivery (because the server believes delivery is possible) but delivery fails, that push can end up being "left" in the queue and then delivered much later if/when the client device is able to reconnect.

I'd appreciate you taking the time to file a bug on this and then posting the bug number back here. I'd like to see us address this but, unfortunately, there have been a lot fewer bugs filed about it than you'd expect.

The second issue is here:

Now if a caller makes call to this app, the call should be delivered through voip push. 2) Voip push should ideally be received even if app is in background and iPhone is locked. It is connected to a good wifi network. But it does not receive the voip push.

The exact details vary, but the general issue here is that the WiFi network is broken. Exactly HOW it's broken varies but here are two real world examples:

  1. Some WiFi access points can be configured to ask clients to disassociate after a fixed time period. If that happens while the device is locked and sleeping, then iOS can end up disconnecting from the AP, leaving it with no network connection.

  2. I've seen a few cases where NAT servers were intentionally breaking connections without notifying both ends that the connection was closed. Typically this means they disconnect the WAN side (APNS server side) but don't notify the LAN side (iOS device). APNS may or may not (depending on what the NAT server did) know there is a problem, but it can't do anything about it, since only the client can reach out.

In both cases, waking/unlocking the device gets "everything" working again as the device reconnects to WiFi and/or starts realizing that connections aren't working properly.

My big warning here is that, in my experience, WiFi-only VoIP does NOT "just work". It LOOKS like it works fine (and often does) on small-scale test networks and/or well-configured and surveyed enterprise networks. It doesn't work well on many enterprise/small business networks. In basically "all" the cases I've looked at, this is because the network did NOT in fact work properly and APNS just happened to be the thing they noticed was broken.

I do have one question here:

  3. After 1-2 hours user unlocks iPhone and opens voip app. As soon as user opens app, the voip push is received and phone starts ringing.

Is it necessary to specifically open the app, or is just unlocking the device enough? If it requires opening the app, then that could indicate that there is an issue with your app as well.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Hi,

Thanks for replying. I filed a bug FB17104567.

Regarding the question you asked: it is not necessary to open the app. Just reconnecting to the internet is enough. The VoIP push is delivered immediately after the internet is available. There is no need to unlock the iPhone or open the app.

Thanks for replying. I filed a bug FB17104567.

Thank you. Your bug did raise one issue that you should address on your side. In your bug, you said:

"Ideally above mentioned issue should not happen as we have set 'apns-expiry' to a short time period like 30 seconds."

All VoIP pushes should be sent with "apns-expiration=0". That's because:

  1. Using a non-zero expiration only has a marginal impact on the total reported calls. Basically, you go from ringing on "all devices currently connected to APNS" to ringing on "all currently connected devices" + "devices that happen to connect within the next X seconds". That is some additional calls, but unless lots of your users have terrible connectivity it won't be THAT many.

  2. Related to that point, the number of additional calls that SUCCEED (meaning, the user is able to connect and talk to the other person) will be significantly less than the new rings created by #1. If you stare at enough failed call traces, you quickly notice that it's fairly common for the push connection to cycle between connect/fail/connect/fail multiple times before establishing a stable connection or failing completely again. The connection failed because the device was out of cellular range and it's common for connectivity to be weak/unstable as the device comes back into range.

  3. Because of how pushes are queued for delivery (which is what causes this issue in the first place), any value besides "apns-expiration=0" will dramatically increase the possibility of expired push delivery.

In other words, increasing the expiration time actually ends up significantly increasing the number of failed calls (either because of poor connectivity or push expiration) and only modestly increasing connected calls (if at all).
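
For reference, here is a minimal server-side sketch in Python of the request headers discussed above. The header names (apns-push-type, apns-topic, apns-priority, apns-expiration) are the standard APNs HTTP/2 ones; the function name `voip_push_headers`, its parameters, and the example bundle ID are hypothetical. It only builds the header dictionary and does not send anything.

```python
import time
from typing import Dict, Optional


def voip_push_headers(
    bundle_id: str,
    ttl_seconds: int = 0,
    now: Optional[float] = None,
) -> Dict[str, str]:
    """Build APNs HTTP/2 request headers for a VoIP push.

    ttl_seconds == 0 produces apns-expiration: 0, which asks APNs not to
    store the push for later delivery. Any other value is converted to an
    absolute UTC epoch timestamp, which is the format APNs expects.
    """
    if ttl_seconds < 0:
        raise ValueError("ttl_seconds must be >= 0")

    if ttl_seconds == 0:
        expiration = 0
    else:
        current = time.time() if now is None else now
        expiration = int(current) + ttl_seconds

    return {
        "apns-push-type": "voip",
        # VoIP pushes use the app's bundle ID with a ".voip" suffix as topic.
        "apns-topic": f"{bundle_id}.voip",
        # Priority 10 requests immediate delivery.
        "apns-priority": "10",
        "apns-expiration": str(expiration),
    }


if __name__ == "__main__":
    # "com.example.app" is a placeholder bundle ID.
    print(voip_push_headers("com.example.app"))
```

With the default `ttl_seconds=0` this yields `apns-expiration: 0`, the value recommended above; the non-zero branch is included only to illustrate the epoch-seconds format.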

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Following up on this, iOS 26.4 has introduced new API to improve the handling of situations where call reporting is not required. See this forum post for more details.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

I can confirm we are also encountering the same issue... We have tried setting apns-expiration to zero and also to an epoch 5 seconds in the future (UTC)... The exact test is: put an iPhone off grid (no WiFi / no 5G), trigger an APNs VoIP push (PushKit) with expiration set as above, wait 5 minutes, then reconnect WiFi.. The incoming call arrives.... Begging Apple to fix this. VoIP APNs is by definition realtime. There is no point delivering late ones (in contrast to SMS-style APNs).. I have also opened a bug for this. Case-ID: 20013249

I can confirm we are also encountering the same issue... We have tried setting apns-expiration to zero and also to an epoch 5 seconds in the future (UTC)... The exact test is: put an iPhone off grid (no WiFi / no 5G), trigger an APNs VoIP push (PushKit) with expiration set as above, wait 5 minutes, then reconnect WiFi.. The incoming call arrives....

Yes. This has always been an issue with PushKit/CallKit.

Begging Apple to fix this. VoIP APNs is by definition realtime. There is no point delivering late ones (in contrast to SMS-style APNs).. I have also opened a bug for this. Case-ID: 20013249

We have. The solution is the new PushKit delegate I described in this post.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Many thanks for the feedback. And noted on the new feature. My question, however, is what is the point of apns-expiration if Apple does not adhere to it... It puts extra pressure on the far-end device to accept or reject a notification on the basis of the passed boolean (which may also be wrong).... It would be nice if Apple could make a server-side adjustment so that delivery is gated on expiration; that would solve a major headache.... I note apps like WhatsApp have had to let the phone ring and cancel (because you did not have the feature).. As developers, the less change to the container app, the less impact on customers..

Many thanks for the feedback. And noted on the new feature. My question, however, is what is the point of APNs expiration if Apple does not adhere to it...

In theory, it allows our servers to discard old notifications that are no longer relevant; however, there has always been a trade-off between:

  • The CPU/overhead cost of "scanning" the queue for expired notifications.

vs.

  • The bandwidth cost of delivering "unnecessary" notifications.

...and, in practice, server-side CPU has always been far more valuable than bandwidth.

Similarly, on the device side, the system has generally preferred to deliver expired notifications instead of discarding them. Also, just to be clear, the issue you're mentioning here is the way APNs has behaved since it was originally introduced 16 years ago (iOS 3), which VoIP apps have been experiencing since they were transitioned to PushKit 8 years ago (iOS 8), and which has been an active problem for VoIP apps since we introduced the CallKit requirements 4 years ago (iOS 13).

Nothing here is REMOTELY new, and, to be honest, the main reason it took so long to address is that there were FAR fewer developer bugs filed about this than you might think (<20). This is a situation I've been trying to improve for many years, and the biggest issue has always been that, as far as bug reports could show, most developers didn't seem to care that it was happening.

It puts extra pressure on the far-end device to accept or reject a notification on the basis of the passed boolean (which may also be wrong)…

What do you mean by "wrong"? Have you tried the new delegate?

In terms of our implementation, the "mustReport" property of the new "didReceiveIncomingVoIPPushWithPayload" delegate can basically never be "wrong". What actually drives that property is a pre-existing message property callservicesd has always (well, since iOS 13) included, which tells PKPushRegistry whether or not the app "must report" the call. That's the same property PKPushRegistry used to decide whether or not your app SHOULD crash for failing to report, so it can't really ever be "wrong".

It would be nice if Apple could make a server-side adjustment so that delivery is gated on expiration; that would solve a major headache....

I discussed it with the team, but that's not going to happen. The server impact would be much higher than it would seem, and, as far as I can tell, VoIP apps are the ONLY case in the system where expired push delivery actually creates any problem.

In addition, even with VoIP pushes, it's not clear what "expired" should actually mean. There's a clear point where it seems obvious that it should be dropped ("30m"), but what about a push that's 60s late? 30s? 10s?

My original recommendation was that we simply drop pushes at a fixed delay (likely 10s?), but I was persuaded that the risk of disrupting existing apps in new/unexpected ways was too high to justify the change. I'll also note that the new delegate addresses other issues as well, like whether or not an app has to report a call if/when it's on a current call.
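
One way to picture the app-side handling is sketched below in Python, purely as language-neutral decision logic (a real app would do this in its PushKit delegate). Assumptions: the server adds a hypothetical `sent_at` epoch timestamp to the payload, `must_report` stands in for the "mustReport" value described above, the 10-second cutoff is just the figure floated in this thread, and device and server clocks are roughly in sync.

```python
import time
from enum import Enum
from typing import Optional

# Assumed cutoff; the thread only floats "likely 10s?" as one possibility.
MAX_CALL_AGE_SECONDS = 10.0


class Action(Enum):
    RING = "ring"                        # fresh call: report and ring normally
    REPORT_THEN_END = "report_then_end"  # stale, but reporting is mandatory
    IGNORE = "ignore"                    # stale and reporting is not required


def decide(payload: dict, must_report: bool, now: Optional[float] = None) -> Action:
    """Decide what to do with an incoming VoIP push.

    `payload["sent_at"]` is a hypothetical field (epoch seconds) that the
    sending server would have to include; it is not part of APNs itself.
    """
    current = time.time() if now is None else now
    sent_at = payload.get("sent_at")
    if sent_at is None:
        # Without a timestamp the age is unknown, so treat the push as fresh.
        return Action.RING

    age = current - float(sent_at)
    if age <= MAX_CALL_AGE_SECONDS:
        return Action.RING

    # Stale push: only skip reporting when the system says that is allowed.
    return Action.REPORT_THEN_END if must_report else Action.IGNORE
```

The `REPORT_THEN_END` branch corresponds to the "let the phone ring and cancel" workaround mentioned in this thread; `IGNORE` is only reachable when reporting is not required.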

I note apps like WhatsApp have had to let the phone ring and cancel (because you did not have the feature)..

Yes, I'm well aware of that.

As developers, the less change to the container app, the less impact on customers..

As I mentioned above, that's only true if it doesn't break your app. Frankly, the other justification for not discarding the push server side is that developers already spend a great deal of time complaining about undelivered pushes.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Many thanks for your reply. I think the fundamental issue is what counts as a late notification when it comes to VoIP.. The answer comes from common sense. Clearly the current scenario, where an off-grid device gets an APNS VoIP push for a call 3 minutes in the past, is not correct...

We will make the relevant change in our app on our end, but to be honest it's a change that clearly should have been made on the server....

The CPU time incurred to check an expiration time before sending, and to throw the push away if it has expired, is minuscule...

Many thanks for the followup nonetheless

Many thanks for your reply. I think the fundamental issue is what counts as a late notification when it comes to VoIP.. The answer comes from common sense. Clearly the current scenario, where an off-grid device gets an APNS VoIP push for a call 3 minutes in the past, is not correct...

We will make the relevant change in our app on our end, but to be honest it's a change that clearly should have been made on the server....

I think the best solution is exactly what we've done: give your app whatever notification data we get, while also allowing your app to ignore outdated notifications. But, yes, it's unfortunate that it took as long as it has for us to provide that API.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
