Live Caller ID: Multiple userIdentifier values for same device - Expected behavior?

Hello! We're currently testing Live Caller ID implementation and noticed an issue with userIdentifier values in our database.

Initially, we expected to have approximately 100 records (one per user), but the database grew to about 10,000 evaluationKey entries. Upon investigation, we discovered that the userIdentifier (extracted from "User-Identifier" header) for the same device remains constant throughout a day but changes after a few days.

We store these evaluation keys using a composite key pattern "userIdentifier/configHash". All these entries have the same configHash but different userIdentifier values.

This behavior leads to unnecessary database growth as new entries are created for the same users with different userIdentifier values.

Could you please clarify:

  1. Is this the expected behavior for userIdentifier to change over time?
  2. If yes, is there a specific TTL (time-to-live) for userIdentifier?
  3. If this is not intended, could this be a potential iOS bug?

This information would help us optimize our database storage and implement proper cleanup procedures. Thank you for your assistance!

Is this the expected behavior for userIdentifier to change over time?

Yes. As described in "Understanding how Live Caller ID Lookup preserves privacy", one of the major architectural goals is to make it impossible for your server to reliably connect users to their "activity". A big part of that is not providing a stable identifier.

If this is not intended, could this be a potential iOS bug?

As I said, the behavior is absolutely intentional.

If yes, is there a specific TTL (time-to-live) for userIdentifier?

No and maybe. We haven't documented any specific value and I don't think we will, as there is a history of this sort of "exact" behavior being used to attack this type of protocol.

However, there probably is an upper time boundary above which you could discard the tokens knowing they will never be used again. I'm checking with the engineering team to see what guidance I can provide.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Replying to myself:

However, there probably is an upper time boundary above which you could discard the tokens knowing they will never be used again. I'm checking with the engineering team to see what guidance I can provide.

Currently, any evaluation key will not be used again after ~7 days, so my recommendation would be that you add a generous pad to that time window and then discard. In concrete terms, I would do something like:

  1. Monitor for evaluation key usage after the expected expiration (say, 8+ days) and “raise an alarm” if you ever see late usage.

  2. Discard data 10 - 14 days after issue (or however much longer you choose).

This approach lets you avoid accumulating data while also letting you detect if/when anything has changed and giving you plenty of time to adjust before anything disruptive happens.
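For anyone wiring this up, here is a minimal sketch of the monitor-then-discard approach above. It is not an official implementation: `EvaluationKeyStore`, its method names, and the in-memory dict are all hypothetical, and the 8-day and 14-day thresholds are simply the example numbers from this reply. It reuses the "userIdentifier/configHash" composite key from the original question.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds, taken from the example numbers above; tune for your service.
LATE_USE_ALARM_AFTER = timedelta(days=8)   # usage this late is unexpected
DISCARD_AFTER = timedelta(days=14)         # upper end of the 10-14 day window


class EvaluationKeyStore:
    """Hypothetical in-memory store keyed by "userIdentifier/configHash"."""

    def __init__(self):
        self._issued_at = {}   # composite key -> time the key was stored
        self.alarms = []       # (composite key, age) pairs for late usage

    def put(self, user_identifier, config_hash, now):
        """Record an uploaded evaluation key and when it arrived."""
        self._issued_at[f"{user_identifier}/{config_hash}"] = now

    def record_use(self, user_identifier, config_hash, now):
        """Note a lookup with this key; return False if the key is unknown."""
        key = f"{user_identifier}/{config_hash}"
        issued = self._issued_at.get(key)
        if issued is None:
            return False
        age = now - issued
        if age >= LATE_USE_ALARM_AFTER:
            # "Raise an alarm": here we only collect the event for inspection.
            self.alarms.append((key, age))
        return True

    def purge(self, now):
        """Drop keys older than DISCARD_AFTER; return how many were removed."""
        expired = [k for k, t in self._issued_at.items() if now - t >= DISCARD_AFTER]
        for k in expired:
            del self._issued_at[k]
        return len(expired)

    def __len__(self):
        return len(self._issued_at)
```

In a real deployment `purge` would run on a schedule (or be replaced by a TTL feature of your database), and a non-empty `alarms` list is the signal that the ~7 day behavior may have changed.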

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware

Thank you very much for your detailed response, Kevin. This information is helpful for our implementation.

I have an additional question about our current observations. We launched our app in TestFlight to approximately 100 users, but in just 12-15 hours, our database grew to 11,000 unique evaluation keys with a request rate of ~0.5 RPS to our /key endpoint.

This seems unusual given the 7-day TTL you mentioned. The rapid growth continues now. Do you have any insights on why this might be happening?

While researching, I found a paper (https://arxiv.org/pdf/2406.06761) suggesting that privacy-preserving systems like this might generate fake requests for enhanced security. Could this explain our situation? If so, why would there be such a high volume of these requests?

Thank you again for your assistance.

I have an additional question about our current observations. We launched our app in TestFlight to approximately 100 users, but in just 12-15 hours, our database grew to 11,000 unique evaluation keys with a request rate of ~0.5 RPS to our /key endpoint.

I think something is busted on your end. The expected behavior here is that users will have two keys (one blocking, one identity), rotating at the schedule above. So the expected "peak" here would be ~4 keys per user (when the old/new keys "crossover"). I wouldn't be surprised if there is some corner case that might push the number higher than that, but you're well past that "normal" boundary.
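To put numbers on that, here is a quick back-of-the-envelope sketch. The constants just encode the figures in this reply (two keys per user, doubled during a crossover); the function names are made up for illustration.

```python
KEYS_PER_USER = 2      # one blocking key, one identity key
ROTATION_OVERLAP = 2   # old and new key sets coexist during a "crossover"


def expected_peak_keys(user_count):
    """Rough upper bound on stored keys under normal rotation."""
    return user_count * KEYS_PER_USER * ROTATION_OVERLAP


def excess_factor(observed_keys, user_count):
    """How many times larger the observed count is than the expected peak."""
    return observed_keys / expected_peak_keys(user_count)


print(expected_peak_keys(100))       # 400
print(excess_factor(11_000, 100))    # 27.5
```

So with ~100 TestFlight users the "normal" ceiling is on the order of 400 keys, and 11,000 is roughly 27 times that.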

This seems unusual given the 7-day TTL you mentioned. The rapid growth continues now. Do you have any insights on why this might be happening?

I pinged one of the engineers about this, and his first thought was that your /config endpoint might not be returning the correct key status, so clients are thinking that the server does not have a key and uploading a new evaluation key.

One thing you'd probably want to track* is when keys are used over time. You're obviously going to have outliers (user gets key, then spends two weeks off grid as a mountain hermit), but in the case above I suspect you'd see that most of those 11,000 keys are never being used again after a relatively short interval. That would imply that the client thinks the key is "dead" and threw it away "early".

*I'm not sure if I'd do this in production, both for privacy and "scale" reasons. However, you're still at the "make this thing work" stage, where privacy isn't the primary concern.
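As a rough illustration of that kind of tracking (development only, per the caveat above), here is a small sketch. `KeyUsageTracker` and its methods are hypothetical names, and the "abandoned early" thresholds are arbitrary assumptions you would pick yourself.

```python
from datetime import datetime, timedelta, timezone


class KeyUsageTracker:
    """Hypothetical debug-only tracker of when each evaluation key is used."""

    def __init__(self):
        self._first_seen = {}   # key id -> time the key was uploaded
        self._last_used = {}    # key id -> time of the most recent lookup

    def uploaded(self, key_id, now):
        # Keep the earliest upload time if the same key is uploaded again.
        self._first_seen.setdefault(key_id, now)

    def used(self, key_id, now):
        # Ignore lookups for keys we never saw uploaded.
        if key_id in self._first_seen:
            self._last_used[key_id] = now

    def active_lifetime(self, key_id):
        """Time between upload and last observed use (zero if never used)."""
        first = self._first_seen[key_id]
        return self._last_used.get(key_id, first) - first

    def abandoned_early(self, now, active_window, min_age):
        """Keys at least min_age old whose last use fell within active_window of upload."""
        return sorted(
            k for k, first in self._first_seen.items()
            if now - first >= min_age and self.active_lifetime(k) <= active_window
        )
```

If most of the stored keys show up in `abandoned_early` (for example with `active_window=timedelta(hours=1)` and `min_age=timedelta(days=1)`), that would be consistent with clients discarding keys "early" and uploading new ones.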

One broader suggestion I'd have here is to take the time to set up your own dedicated "sandbox" server which you intentionally limit to a VERY small number of users. Particularly in very early development, I'd actually start with EXACTLY 1 user and 1 device, though you would probably broaden that out once you were really confident it "worked". A lot of what makes something like this confusing is the disconnect between client side activity and what your server sees. That's unavoidable in production (confusing the server is, after all, the entire point of this), but for development/testing purposes you can remove that confusion by simply limiting yourself to exactly one device on the server.

While researching, I found a paper (https://arxiv.org/pdf/2406.06761) suggesting that privacy-preserving systems like this might generate fake requests for enhanced security. Could this explain our situation? If so, why would there be such a high volume of these requests?

No, we're not doing that.

__
Kevin Elliott
DTS Engineer, CoreOS/Hardware
