When my app launches, it makes maybe 9 or so network requests to load initial data. It also reads some data from disc. Sporadically, I'm seeing an issue where some of the network requests succeed, but anything involving reading from disc does not load immediately. I'm able to move around in the app, tap buttons, swap tabs, swipe pages, so my main actor isn't stuck. Other data that don't involve disc reading / writing is also blank. About 2 minutes in, suddenly everything loads (both stuff from disc and stuff from the network), nearly instantly, the way it should have done when the app launched. Server logs show more initial network requests succeed than we can see data loaded in the app, and then about 2 minutes later, there's a flood of the rest of the requests which then succeed. The responses to some of these initial network requests cause us to make other network requests, and the sever sees some of those start right away. However, other consequences of these first requests are to touch the disc (to search for manually-cached data), and anything that is supposed to happen after that does not succeed until the 2 minute mark. But what bothers me is some things in the app which don't touch the disc also seem to have successful network requests. I'm seeing it on an iPhone 14Pro running iOS 18.2.1, with 607 GB of disc space available.
When I take screenshots of the loading screens in my app during the apparent freeze, the clock in the screenshots are right - they reflect the clock at the moment I took the screenshot, but the EXIF data in all dozen or so images shows the exact second 2 minutes later when the server gets the resulting flood of network requests. Screenshots taken after the freeze is over have exif timestamps that match the screenshots, as short as 5 seconds after the freeze ends. The screenshot file names, though sequential, are out of order. for instance, some screenshots from 12:58 have file names numbered after screenshots taken at 12:59. but not all are out of order. This seems like disc contention has spread outside the app, and is impacting the system writing the images to disc.
How do I diagnose a cause for this? How does disc contention affect the networking? I have caching turned off for my network requests. We only have a manual image cache, but I don't know how that would stall the display of data that should fetch and display without attempting to hit the image cache.
This happens maybe a couple of times a day for some people, maybe once every couple of weeks for others, but of course, it never when we're trying to debug it.