Background uploads fail with NSURLErrorCannotParseResponse when uploading concurrently to HTTP/2 servers

I maintain a large production app that uses a background-configured URLSession to upload large numbers of files concurrently to our servers using PUT requests. These are file-based uploads (https://developer.apple.com/documentation/foundation/urlsession/1411550-uploadtask), as required for uploads to be supported by background URLSessions. We concurrently upload up to 16 files at once, although the configuration specifies only configuration.httpMaximumConnectionsPerHost = 4.

When our servers migrated to HTTP/2, I noticed that users who were uploading concurrently received many errors from URLSession. Specifically, the delegate method urlSession(_:task:didCompleteWithError:) reported NSURLErrorCannotParseResponse, which has little documentation or public discussion.

When our servers reverted the HTTP/2 change and went back to regular HTTP/1.1, this stopped happening. No change was made to the actual server response. Since iOS negotiates the connection's protocol without our application's involvement, I cannot find a way to force HTTP/1.1. I also cannot see any other details in this NSError (no underlying error or other obvious issue). All I can do is log the network protocol via task metrics.
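For anyone who wants to do the same protocol logging, here is a minimal sketch of the task metrics callback. The class name UploadMetricsLogger is hypothetical and print stands in for whatever logging the app uses; the delegate method and the metrics properties are the standard Foundation ones.

```swift
import Foundation

// Hypothetical delegate class; only the metrics callback is shown.
final class UploadMetricsLogger: NSObject, URLSessionTaskDelegate {
    func urlSession(_ session: URLSession,
                    task: URLSessionTask,
                    didFinishCollecting metrics: URLSessionTaskMetrics) {
        for transaction in metrics.transactionMetrics {
            // networkProtocolName is the negotiated protocol, e.g. "h2" or "http/1.1".
            let proto = transaction.networkProtocolName ?? "unknown"
            print("task \(task.taskIdentifier): protocol=\(proto) reused=\(transaction.isReusedConnection)")
        }
    }
}
```

Logging this next to the error from didCompleteWithError makes it possible to check whether the failures correlate with h2 connections.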

When this error occurs, it is as if nsurlsessiond has crashed or the network stack has fallen apart: I get many NSURLErrorCannotParseResponse errors all at once, before users can manually retry these failures.

This URLSession is configured as follows:

let configuration = URLSessionConfiguration.background(withIdentifier: backgroundIdentifier)
configuration.waitsForConnectivity = true
configuration.allowsExpensiveNetworkAccess = true
configuration.httpMaximumConnectionsPerHost = 4
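For context, a file-based PUT upload on a session like this might be enqueued roughly as follows. This is only a sketch: the function names and the destination URL are hypothetical, and the real app's request headers are not shown in this thread.

```swift
import Foundation

/// Builds the PUT request for one file. `destination` is a hypothetical upload URL.
func makeUploadRequest(to destination: URL) -> URLRequest {
    var request = URLRequest(url: destination)
    request.httpMethod = "PUT"
    return request
}

/// Enqueues a file-based upload on an existing background session.
/// Background sessions only support uploads from a file on disk.
func enqueueUpload(of file: URL, to destination: URL, on session: URLSession) {
    let task = session.uploadTask(with: makeUploadRequest(to: destination), fromFile: file)
    task.resume()
}
```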

This is not a good experience and I have no other way to solve these errors. Does anyone have advice for this scenario?

One reason NSURLErrorCannotParseResponse could happen is that the response headers could not be parsed correctly. I see that you're using PUT requests here. If you send back a 204 No Content, you're not sending a response body along with the headers, are you? While it is possible for a server to do so, it would not be semantically correct and could cause issues.

Matt Eaton
DTS Engineering, CoreOS
meaton3@apple.com

In our case, these responses are definitely status 200s with a JSON body. It doesn't happen with all requests, only for a minor but important chunk of users who, once they receive NSURLErrorCannotParseResponse, tend to get a lot of them. We attempt to retry these requests, and that helps but does not resolve the issue.
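To illustrate the kind of retry meant here, a minimal sketch of a bounded retry decision follows. RetryPolicy, the attempt limit, and the backoff delays are all assumptions made for illustration, not the app's actual values.

```swift
import Foundation

/// Hypothetical retry policy; the attempt limit and delays are assumptions.
struct RetryPolicy {
    var maxAttempts = 3
    var baseDelay: TimeInterval = 2

    /// Returns the delay before the next attempt, or nil to give up.
    /// `attempt` is the number of retries already made for this file.
    func delay(for error: Error, attempt: Int) -> TimeInterval? {
        let nsError = error as NSError
        guard nsError.domain == NSURLErrorDomain,
              nsError.code == NSURLErrorCannotParseResponse,
              attempt < maxAttempts else { return nil }
        // Exponential backoff: baseDelay, then 2x, then 4x.
        return baseDelay * pow(2, Double(attempt))
    }
}
```

The delay would be computed in urlSession(_:task:didCompleteWithError:) and a new upload task created for the same file once it elapses. Because the errors arrive in bursts, spacing the retries out seems safer than retrying immediately.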

I have recently been able to reproduce the issue on the iOS Simulator by simulating a transient, dropping network connection with Network Link Conditioner. The failure is also reported in the console as:

error 17:51:08.138859-0800 nsurlsessiond Task <86EF8AD5-8532-4D5F-B52F-4462078EA23A>.<525> HTTP load failed, 49316/0 bytes (error code: -1017 [4:-1])

We also inspected the CFNetwork binaries and found the matching format string: _os_log_error_impl(rip - 0x14bb9b, *__os_log_default, 0x10, "%{public}@ HTTP load failed, %lld/%lld bytes (error code: %ld [%ld:%d])", &var_70, 0x3a);

This may be a transient error leaking through a fall-through case somewhere specific to the background nsurlsessiond. For a background URLSession, transient errors such as a lost connection are supposed to never be reported back to us, and uploads should only happen when the network connection is good. I fear this is a bug in the network stack, which is failing to recognize a transient error (and so neither retries it nor suppresses it).

An example of URLSessionTaskMetrics from one such failing request, a background upload of a 4 MB file:

(Request) <NSURLRequest: 0x600002d2d3f0> { URL: https://{private} }

(Response) (null)

(Fetch Start) 2022-02-17 21:09:04 +0000

(Domain Lookup Start) 2022-02-17 21:09:05 +0000

(Domain Lookup End) 2022-02-17 21:09:06 +0000

(Connect Start) 2022-02-17 21:09:06 +0000

(Secure Connection Start) 2022-02-17 21:09:07 +0000

(Secure Connection End) 2022-02-17 21:09:09 +0000

(Connect End) 2022-02-17 21:09:09 +0000

(Request Start) 2022-02-17 21:09:11 +0000

(Request End) 2022-02-17 21:09:44 +0000

(Response Start) (null)

(Response End) (null)

(Protocol Name) h2

(Proxy Connection) NO

(Reused Connection) NO

(Fetch Type) Network Load

(Request Header Bytes) 315

(Request Body Transfer Bytes) 2606217

(Request Body Bytes) 2604768

(Response Header Bytes) 0

(Response Body Transfer Bytes) 0

(Response Body Bytes) 0

(Local Address) 172.25.17.210

(Local Port) 51738

(Remote Address) {private}

(Remote Port) 443

(TLS Protocol Version) 0x0303

(TLS Cipher Suite) 0xC02F

(Cellular) NO

(Expensive) NO

(Constrained) NO

(Multipath) NO

2022-02-17 13:09:53.089075-0800 Cup[81868:1972731] Task <FDDA3872-D8A7-4832-86A7-1F27537851E5>.<821> finished with error [-1017] Error Domain=NSURLErrorDomain Code=-1017 "cannot parse response" UserInfo={_kCFStreamErrorCodeKey=-1, _NSURLErrorFailingURLSessionTaskErrorKey=BackgroundUploadTask <FDDA3872-D8A7-4832-86A7-1F27537851E5>.<821>, _NSURLErrorRelatedURLSessionTaskErrorKey=(
    "BackgroundUploadTask <FDDA3872-D8A7-4832-86A7-1F27537851E5>.<821>",
    "LocalUploadTask <FDDA3872-D8A7-4832-86A7-1F27537851E5>.<821>"
), NSLocalizedDescription=cannot parse response, _kCFStreamErrorDomainKey=4, NSErrorFailingURLStringKey={private}, NSErrorFailingURLKey={private}}
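The [4:-1] pair in the nsurlsessiond log line matches the _kCFStreamErrorDomainKey and _kCFStreamErrorCodeKey values in this UserInfo dictionary. A small sketch for logging them from the app side is below. The function name is hypothetical, and the two keys are undocumented strings copied from the output above, so treat their presence as an assumption that may not hold on every OS version.

```swift
import Foundation

/// Formats an error like the log line does: "code [streamDomain:streamCode]".
/// Falls back to just the error code when the stream keys are absent.
func describeStreamError(_ error: Error) -> String {
    let nsError = error as NSError
    // Undocumented keys, taken verbatim from the UserInfo dump in this thread.
    let streamDomain = nsError.userInfo["_kCFStreamErrorDomainKey"] as? Int
    let streamCode = nsError.userInfo["_kCFStreamErrorCodeKey"] as? Int
    guard let domain = streamDomain, let code = streamCode else {
        return "\(nsError.code)"
    }
    return "\(nsError.code) [\(domain):\(code)]"
}
```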

Regarding "In our case, these responses are definitely status 200s with a JSON body" and the metrics showing (Response Header Bytes) 0, (Response Body Transfer Bytes) 0, (Response Body Bytes) 0:

Yeah, the response bytes being 0 definitely worry me.

Two things here regarding:

We concurrently upload up to 16 files at once, although the configuration specifies only configuration.httpMaximumConnectionsPerHost = 4

If you upload one file at a time, do you see the same issue? If not, and everything works, then you should open a bug to get this looked at further.

Are your network uploads passing through a complex network infrastructure? For example, are these requests getting lost somehow on the way back? Do they terminate at or pass through reverse proxies anywhere?
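One way to run that one-at-a-time experiment without restructuring the app is a small limiter in front of task creation. This is a sketch only: UploadLimiter is a hypothetical helper, it is not thread-safe, and throttling on the app side means later uploads start only while the app is running, so it is meant as a diagnostic rather than a fix.

```swift
import Foundation

/// Hypothetical helper that caps how many uploads are in flight at once.
/// Not thread-safe: call it from a single queue (e.g. the delegate queue).
final class UploadLimiter {
    private let maxInFlight: Int
    private let start: (URL) -> Void
    private var pending: [URL] = []
    private(set) var inFlight = 0

    init(maxInFlight: Int, start: @escaping (URL) -> Void) {
        self.maxInFlight = maxInFlight
        self.start = start
    }

    /// Queues a file and starts it immediately if there is capacity.
    func enqueue(_ file: URL) {
        pending.append(file)
        startPendingIfPossible()
    }

    /// Call from urlSession(_:task:didCompleteWithError:) for every finished task.
    func taskDidComplete() {
        inFlight = max(0, inFlight - 1)
        startPendingIfPossible()
    }

    private func startPendingIfPossible() {
        while inFlight < maxInFlight, !pending.isEmpty {
            inFlight += 1
            start(pending.removeFirst())
        }
    }
}
```

With maxInFlight set to 1, the start closure would create and resume the upload task for the given file, and the next file starts only after the previous task completes.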

Matt Eaton
DTS Engineering, CoreOS
meaton3@apple.com

+1, I'm also encountering an NSURLErrorCannotParseResponse error on a seemingly simple GET request. Interestingly, when I set up a proxy (Proxyman), requests from the same device succeed, then start failing again when I turn proxying off. That suggests to me that it has something to do with the network connection to our specific server (i.e. the proxy tool is masking an underlying issue).

Has anyone on the thread here had success in resolving/avoiding this error?

I noted the mention of the problem starting when "servers migrated to Http2". We similarly moved to HTTP/2 when we introduced a caching layer in front of our service. Requests sent directly to the underlying service don't exhibit this issue. I can raise a bug for this, but I'm hoping we can determine a more specific cause and a workaround for existing deployed apps.
