Hey Quinn,
Again, thank you for your reply, but unfortunately it didn't help much.
I spent several days checking everything that came to mind regarding this issue, but I haven't been able to fix it so far.
However, I'm now sure that it isn't code-related, since I was able to reproduce the issue on a lightweight project (100-ish lines in the only file that actually does something), by using the latest network APIs (can confirm that it isn't related to the deprecated NSURLConnection APIs at all).
Here are the facts I know about the issue:
- It triggers an NSURLError -1005 "The network connection was lost", even though it received an HTTP 200 server response (requests are also logged server-side, and I'm sure the server response was valid).
- Before the error is actually triggered, we see system error logs such as the ones I quoted in my first post; they don't necessarily mean that the request will fail though.
- It only happens on iOS 10 devices (tested under iOS 10.2 and iOS 10.1.1), on all kinds of devices (tested on iPhone 7, iPhone SE, iPhone 6s, iPhone 6, iPad mini 3); I wasn't able to trigger the error on an iOS 9 device.
- It doesn't happen on other platforms such as Android, even though the Android app accesses the exact same endpoints.
- However, it DOES happen when I access the same URL using Safari on iOS 10. Just like in my test app, it is not systematic, but it occasionally happens. The same error is logged: "com.apple.WebKit.Networking(CFNetwork)[2381] <Error>: NSURLSessionTask finished with error - code: -1005"
- It only happens on real devices; the iOS 10.2 simulator works just fine.
- It only happens over Wi-Fi, never over cellular connections.
- It only happens when I'm sending requests with TLS to our server endpoint; the resource is also accessible over plain HTTP, and we see no connection errors or system logs when sending HTTP requests.
- ATS settings seem to have no effect on the issue.
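For reference, here is a minimal sketch of the kind of check my test project does. It is not the actual project code: `isConnectionLost` and `probe` are names I made up for this post, and the URL you pass in is up to you.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLSession lives here on non-Apple platforms
#endif

/// True when the error is URLError.networkConnectionLost
/// (NSURLErrorDomain, code -1005).
func isConnectionLost(_ error: Error?) -> Bool {
    guard let urlError = error as? URLError else { return false }
    return urlError.code == .networkConnectionLost
}

/// Hypothetical helper: starts one data task and logs how it ended.
func probe(_ url: URL, session: URLSession = .shared) {
    let task = session.dataTask(with: url) { data, response, error in
        let status = (response as? HTTPURLResponse)?.statusCode
        if isConnectionLost(error) {
            print("connection lost (-1005), status: \(String(describing: status))")
        } else if let error = error {
            print("other error: \(error)")
        } else {
            print("ok, status: \(String(describing: status)), bytes: \(data?.count ?? 0)")
        }
    }
    task.resume()
}
```

Calling `probe` in a loop against the HTTPS endpoint is enough to see the failures on an affected network.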
Here are other WEIRD facts:
- It doesn't happen on every Wi-Fi network. I can reproduce the issue at work, but can't reproduce it at home. However, several of my co-workers can trigger the issue on their home networks.
- It seems that the error only triggers when the server response data exceeds a certain size, just over 3500 bytes. If the server response is around 1KB, it still triggers system error logs ("[57] Socket is not connected", "Write request has 0 frame count, 0 byte count", "Write close callback received error: [89] Operation canceled"...), but the requests never fail.
- Actually, setting the HTTP header "Accept-Encoding" to "deflate" on my requests helps to trigger the issue, since the server is forced to send a larger response.
- After playing around with the Network Link Conditioner on my device, it turned out that the "In-delay" and "Out-delay" values have a great impact on how often the issue occurs. After setting the delay values to 50ms, I no longer see any system logs, and the requests are processed just fine. If I set them to 1ms (minimum value), the system logs reappear, and my requests eventually start to fail again.
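In case it helps anyone reproduce this, the "Accept-Encoding" change can be sketched roughly like this. `makeDeflateRequest` is a hypothetical helper, and example.com is a placeholder since I obfuscated the real host.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

/// Hypothetical helper: builds a GET request that only accepts "deflate".
func makeDeflateRequest(url: URL) -> URLRequest {
    var request = URLRequest(url: url)
    request.httpMethod = "GET"
    request.setValue("deflate", forHTTPHeaderField: "Accept-Encoding")
    return request
}

// Placeholder URL; the real endpoint is obfuscated in this thread.
let request = makeDeflateRequest(url: URL(string: "https://example.com/api/home")!)
```

The resulting request can then be handed to `URLSession.dataTask(with:completionHandler:)` as usual.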
Sorry, that's a lot of information to process, but this issue is driving me nuts. I've spent days looking for the root of the problem and searching for a way to fix it.
In the end, it would seem that the only thing I can do about it is implementing a way to "slow down" the requests, but that doesn't sound like the right thing to do...
I don't know if you'll be able to help, but anyway, thanks for reading.
I obfuscated the server address for obvious reasons, but if you want to perform some tests on your end, don't hesitate to ask and I'll drop you a mail.
Here is some additional info on the error received when a request fails:
(lldb) po error!
Error Domain=NSURLErrorDomain Code=-1005 "The network connection was lost." UserInfo={NSUnderlyingError=0x17005f170 {Error Domain=kCFErrorDomainCFNetwork Code=-1005 "(null)" UserInfo={NSErrorPeerAddressKey=<CFData 0x1700886b0 [0x1a6e09bb8]>{length = 16, capacity = 16, bytes = 0x100201bb3ed2d2d10000000000000000}, _kCFStreamErrorCodeKey=54, _kCFStreamErrorDomainKey=1}}, NSErrorFailingURLStringKey=https://***.***.com:443/api/home, NSErrorFailingURLKey=https://***.***.com:443/api/home, _kCFStreamErrorDomainKey=1, _kCFStreamErrorCodeKey=54, NSLocalizedDescription=The network connection was lost.}
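To read the stream error out of that dump: "_kCFStreamErrorDomainKey=1" is kCFStreamErrorDomainPOSIX, so "_kCFStreamErrorCodeKey=54" is an errno value, which on Darwin is ECONNRESET. Below is a rough sketch of how this could be pulled out in code. `describeStreamError` and the lookup table are my own illustration, and the "_kCFStream..." keys are undocumented, so this is for debugging only.

```swift
import Foundation

/// Darwin errno values that show up in this thread's logs.
/// Hard-coded for illustration; the numbers differ on other platforms.
let darwinErrnoNames: [Int: String] = [
    54: "ECONNRESET (connection reset by peer)",
    57: "ENOTCONN (socket is not connected)",
    89: "ECANCELED (operation canceled)",
]

/// Reads the undocumented "_kCFStreamError*" keys from the error's userInfo.
/// Returns nil when the keys are missing.
func describeStreamError(_ error: NSError) -> String? {
    guard let domain = error.userInfo["_kCFStreamErrorDomainKey"] as? Int,
          let code = error.userInfo["_kCFStreamErrorCodeKey"] as? Int else {
        return nil
    }
    // Domain 1 is kCFStreamErrorDomainPOSIX, so the code is an errno value.
    guard domain == 1 else { return "stream domain \(domain), code \(code)" }
    return darwinErrnoNames[code] ?? "errno \(code)"
}
```

In the app, this would be called with `error as NSError` from the task's completion handler.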
I "TLSTooled" our server to check that the TLS setup wasn't at fault:
./TLSTool s_client -connect ***.***.com:443
* input stream did open
* output stream did open
* output stream has space
* protocol: TLS 1.2
* cipher: ECDHE_RSA_WITH_AES_128_GCM_SHA256
* trust result: unspecified
* certificate info:
* 0 + rsaEncryption 2048 sha256-with-rsa-signature '*.***.com'
* 1 + rsaEncryption 2048 sha256-with-rsa-signature 'RapidSSL SHA256 CA - G3'
* 2 + rsaEncryption 2048 sha1-with-rsa-signature 'GeoTrust Global CA'
^C