We're troubleshooting ScreenCaptureKit (SCK) issues. They occur in a relatively small number of sessions, but the lack of context, and our inability to advise customers on how to make behavior more predictable and reliable, is problematic.
Generally, there are two distinct issues, which may or may not share a root cause:
- Failure to establish an SCK session. This usually manifests within the app as the SCShareableContent.getWithCompletionHandler call either never invoking the completion handler or taking a prohibitively long time (we usually give it 3-10 seconds before giving up). In the system log it may look like this:
(log omitted - suspecting it triggers the content filter)
Note the 6-second delay before fetchShareableContentWithOption completes (normally it's a 30-40 ms operation).
- Sometimes the stream is established, but some minutes (or even hours) into the recording we stop receiving frames.
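For context on the first issue, here is a minimal sketch of the kind of give-up timer described above. It is not our production code: `TimedResult`, `OnceFlag`, and `runWithTimeout` are hypothetical names, and the helper is deliberately generic so it only depends on Foundation and Dispatch.

```swift
import Foundation
import Dispatch

/// Outcome of an operation raced against a deadline (hypothetical type).
enum TimedResult<Value> {
    case finished(Value)
    case timedOut
}

/// Thread-safe flag that lets only the first caller win (hypothetical type).
final class OnceFlag: @unchecked Sendable {
    private let lock = NSLock()
    private var claimed = false

    func claim() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if claimed { return false }
        claimed = true
        return true
    }
}

/// Starts a completion-handler-based operation and reports exactly once:
/// with the operation's value, or with .timedOut if the deadline passes first.
/// A value that arrives after the deadline is ignored.
func runWithTimeout<Value>(
    seconds: TimeInterval,
    operation: (@escaping (Value) -> Void) -> Void,
    completion: @escaping @Sendable (TimedResult<Value>) -> Void
) {
    let flag = OnceFlag()
    // Arm the deadline first so a handler that never fires still yields a result.
    DispatchQueue.global().asyncAfter(deadline: .now() + seconds) {
        if flag.claim() { completion(.timedOut) }
    }
    operation { value in
        if flag.claim() { completion(.finished(value)) }
    }
}
```

In the app, the `operation` closure would wrap SCShareableContent.getWithCompletionHandler and forward its (content, error) pair to the callback. The timeout only stops us from waiting; it does not cancel whatever is stuck on the system side.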
Both scenarios are more likely when disk space is low; problem #2 reproduces reliably below 8 GB of free space (in that case, we've seen replayd silently drop the session without ever notifying the client ... improving the API could go a long way there). However, among recent occurrences, while most had less than 100 GB available, we've seen it on machines with as much as 500 GB free.
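Since the session can be dropped without any notification, the only client-side signal for the second issue is the absence of frames. Below is a rough sketch of a stall check under that assumption. `FrameStallWatchdog` is a hypothetical name, and the threshold is a guess that would need tuning, since how often callbacks arrive for a static screen depends on the stream configuration.

```swift
import Foundation

/// Tracks when the last frame arrived and reports a stall once the gap
/// exceeds a threshold (hypothetical helper, not part of ScreenCaptureKit).
final class FrameStallWatchdog: @unchecked Sendable {
    private let lock = NSLock()
    private var lastFrame: Date
    private let threshold: TimeInterval

    init(threshold: TimeInterval, now: Date = Date()) {
        self.threshold = threshold
        self.lastFrame = now
    }

    /// Call this from the frame callback each time a sample buffer arrives.
    func frameReceived(at time: Date = Date()) {
        lock.lock()
        defer { lock.unlock() }
        lastFrame = time
    }

    /// True when no frame has been recorded for longer than the threshold.
    func isStalled(at time: Date = Date()) -> Bool {
        lock.lock()
        defer { lock.unlock() }
        return time.timeIntervalSince(lastFrame) > threshold
    }
}
```

The idea is to call `frameReceived()` from the SCStreamOutput callback `stream(_:didOutputSampleBuffer:of:)` and poll `isStalled()` on a timer, so a silent drop at least produces a timestamped event in our diagnostics.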
Unfortunately, it's almost never reproducible in a dev environment, so we have to rely on diagnostics we're able to collect in the field, which have shown nothing obvious yet.
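As an illustration of the sort of field diagnostic involved, this sketch records free disk space so it can be attached to a failure report. `freeBytes` and `isBelowThreshold` are hypothetical names, and gigabytes are assumed to be decimal. On APFS, the URL resource key `volumeAvailableCapacityForImportantUsageKey` may give a more representative number because it accounts for purgeable space.

```swift
import Foundation

/// Free bytes on the volume containing `path`, or nil if the lookup fails.
func freeBytes(atPath path: String = NSHomeDirectory()) -> Int64? {
    guard let attrs = try? FileManager.default.attributesOfFileSystem(forPath: path),
          let free = attrs[.systemFreeSize] as? NSNumber else {
        return nil
    }
    return free.int64Value
}

/// True when the free space is under the given threshold in decimal gigabytes.
func isBelowThreshold(freeBytes: Int64, thresholdGB: Double) -> Bool {
    return Double(freeBytes) < thresholdGB * 1_000_000_000
}
```

Sampling this at session start and again when a timeout or stall is detected would show whether free space was near the 8 GB mark at the moment of failure.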
I'd like to better understand the root cause of both scenarios and/or which specific frameworks can cause these behaviors.