Jetsam memory crash during Network framework usage

I'm using the Network framework to transfer files between two devices. The "secondary" device sends file requests to the "primary" device, and the primary sends the files back. When the primary gets a request, it responds like this:


do {
    let data = try Data(contentsOf: filePath)
    
    let priSecDataFilePacket = PriSecDataFilePacket(fileName: filename, dataBlob: data)
    
    let jsonData = try JSONEncoder().encode(priSecDataFilePacket)

    let message = NWProtocolFramer.Message(priSecMessageType: PriSecMessageType.priToSecDataFile)

    let context = NWConnection.ContentContext(identifier: "TransferUtility", metadata: [message])

    connection.send(content: jsonData, contentContext: context, isComplete: true, completion: .idempotent)
} catch {
    print("\(error)")
}

It works great, even for hundreds of file requests. The problem arises when some of the requested files are extremely large, say 600 MB. You can watch the memory gauge on the primary quickly ramp into the yellow zone, at which point iOS kills the app for excessive memory use and you see the Jetsam log.

As a test, I changed the code to skip JSON-encoding the binary file, and that helped a bit, but memory use still climbs too high. The real offender is the step that loads the 600 MB file into the data variable:

let data = try Data(contentsOf: filePath)

If I comment out everything else and leave just that one line, I can still see the memory use spike.

As a fix, I'm rewriting this so the secondary requests the file in 5 MB chunks by telling the primary a byte range such as "0-5242879" or "5242880-10485759", then reassembling the chunks on the secondary once they all come in. So far this seems promising, but it's a fair amount of work.

My question: Does Network framework have a built-in way to stream those bytes straight from disk as it sends them? That way I could send all the data in one request without having to load the bytes into memory.

Written by southbayjt in 775908021
Does Network framework have a built-in way to stream those bytes straight from disk as it sends them?

No. When dealing with a large file like this, you have to adopt a streaming approach. That is:

  1. Read a chunk of the file off the disk.

  2. Send it to the connection.

  3. When the send completion handler is called, check whether there’s more data to send. If so, repeat the process from step 1.


It’s weird that you’re JSON encoding the data. If you’re dealing with large files, JSON is needlessly inefficient. That doesn’t change the calculus above — you could make the transfer smaller by skipping the encoding but there can always be a file that’s too big — but it does affect your on-the-wire efficiency. You’re transferring one third more bytes than you need to.
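
To put a number on that overhead: Base64 encodes every 3 bytes of input as 4 bytes of output. A quick illustration (not from the original post):

import Foundation

// Illustrative only: Base64 expands data by a factor of 4/3.
let raw = Data(count: 600)              // stand-in for file bytes
let encoded = raw.base64EncodedData()
print(raw.count, encoded.count)         // prints "600 800"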

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

Thanks for the info. So it sounds like chunking is the way to go. Regarding the JSON encoding, is there an alternate way to send a struct that can be decoded on the other end? The struct carries the other pieces of data that describe the binary blob coming over, including a couple of extra fields like a unique ID. One approach I was considering: encode the struct without the binary blob, determine the size in bytes of the encoded struct, concatenate the JSON-encoded struct with the file blob, and finally prepend a fixed-size field (say 20 bytes) holding the size of the struct. On the receiving side, peel off the first 20 bytes to learn how many bytes belong to the struct, peel off and decode those bytes, and treat the remaining bytes as the file blob.

I tried this as an alternative:

let priSecDataFilePacket = PriSecDataFilePacket(
    fileName: filenamePart,
    startPointer: fileReqPacket.startPointer,
    endPointer: fileReqPacket.endPointer,
    chunkNum: fileReqPacket.chunkNum,
    totalChunkCount: fileReqPacket.totalChunkCount
)

let jsonEncoder = JSONEncoder()
let jsonData = try jsonEncoder.encode(priSecDataFilePacket)

// Zero-pad the header size to a fixed 10-character decimal string.
let jsonSize = jsonData.count
let jsonSizeAsString = String(jsonSize)
let padded = String(repeating: "0", count: max(0, 10 - jsonSizeAsString.count)) + jsonSizeAsString
guard let jsonSizeAsStringAsData = padded.data(using: .utf8) else { return }

// Size prefix + JSON header + raw file bytes, all in one buffer.
let totalSize = jsonSizeAsStringAsData.count + jsonData.count + fileBlob.count
var dataAmalgam = Data(capacity: totalSize)
dataAmalgam.append(jsonSizeAsStringAsData)
dataAmalgam.append(jsonData)
dataAmalgam.append(fileBlob)

sendJson(encodedJsonToSend: dataAmalgam, passedInPriSecMsgType: .priToSecDataFile)

but this spikes the memory during the send, as if the memory is being held onto longer. The chunking approach (using 20 MB chunks that are JSON-encoded into the struct) does indeed make a larger data blob, like you said, but it doesn't spike the memory.

I also tried PropertyListEncoder instead of JSONEncoder (which I only just learned about tonight), but it has the same memory-spiking issue. It's almost as if Network framework is somehow optimized for a binary blob that's JSON-encoded? (Or maybe it's a lucky side effect: the JSON encoding is slow enough that the device has time to release memory before performing the next send. Grasping at straws here.)

Accepted Answer

There are three things to consider here:

  • The cost of JSON encoding large blobs of binary data

  • Streaming a large file to a network connection

  • Head-of-line blocking

I’ll tackle each in turn.


JSON doesn’t support binary data, which means that your data is being Base64 encoded. That increases the size of the data by a third. You really want to avoid that.

There are lots of ways you might do that, and what you’ve described is reasonable. The key thing is to split the header, all the stuff describing the data, from the data. That way you can send the header without worrying about flow control.
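
For example, the receive side of the length-prefixed format you described might be sketched like this. This is only a sketch: it assumes your PriSecDataFilePacket type and the 10-character decimal size prefix from your code, and error handling is kept minimal.

import Foundation

func unpack(_ amalgam: Data) throws -> (header: PriSecDataFilePacket, blob: Data) {
    let prefixLength = 10
    var rest = amalgam

    // Peel off the fixed-size decimal string that holds the header size.
    guard let sizeString = String(data: rest.prefix(prefixLength), encoding: .utf8),
          let jsonSize = Int(sizeString) else {
        throw CocoaError(.coderReadCorrupt)
    }
    rest = rest.dropFirst(prefixLength)

    // Peel off and decode the JSON header.
    guard rest.count >= jsonSize else { throw CocoaError(.coderReadCorrupt) }
    let header = try JSONDecoder().decode(PriSecDataFilePacket.self, from: rest.prefix(jsonSize))

    // Everything left is the file blob.
    let blob = Data(rest.dropFirst(jsonSize))
    return (header, blob)
}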


In terms of streaming the data, splitting it up into chunks won’t necessarily help. You have to avoid two things:

  • Holding all of the data in memory.

  • Putting too much data into the connection at once.

For example, code like this:

on sendWholeFile
  read the file                                 -- A
  while the residual is not empty
    split a chunk off the front of the file
    write that to the connection                -- B
  end while
end sendWholeFile

won’t work because:

  • You’re reading all of the file up front (line A)

  • You then write chunks to the connection way faster than it can transfer them (line B), resulting in the connection buffering a second copy of all your data.

You need something like this:

on sendChunk(offset)
  read a chunk from the offset
  write that to the connection
    on completion
      increment the offset
      sendChunk(offset)
end sendChunk

In this approach, in the likely case where you can read data faster than the connection can transfer it, data starts to accumulate in the send buffer associated with the connection. Eventually the send buffer fills up and it stops calling your completion handler, meaning that you stop reading data from the file. Once the connection transfers enough data, it calls your completion handler, you read the next chunk, and things proceed from there.
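
Translated into a rough Swift sketch: this assumes an established NWConnection, the chunkSize is illustrative, and you'd also supply a content context if you're using a custom framer.

import Foundation
import Network

let chunkSize = 64 * 1024   // illustrative; tune for your transport

func sendFile(at url: URL, over connection: NWConnection,
              completion: @escaping (Error?) -> Void) {
    do {
        let file = try FileHandle(forReadingFrom: url)
        sendNextChunk(from: file, over: connection, completion: completion)
    } catch {
        completion(error)
    }
}

func sendNextChunk(from file: FileHandle, over connection: NWConnection,
                   completion: @escaping (Error?) -> Void) {
    let chunk: Data
    do {
        chunk = try file.read(upToCount: chunkSize) ?? Data()
    } catch {
        completion(error)
        return
    }
    if chunk.isEmpty {
        // End of file: close the handle and mark the content complete.
        try? file.close()
        connection.send(content: nil, isComplete: true, completion: .contentProcessed { error in
            completion(error)
        })
        return
    }
    // Don't read the next chunk until the connection has consumed this one;
    // the connection's send buffer provides the back pressure.
    connection.send(content: chunk, isComplete: false, completion: .contentProcessed { error in
        if let error = error {
            completion(error)
        } else {
            sendNextChunk(from: file, over: connection, completion: completion)
        }
    })
}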

To see a concrete example of this idea, albeit in a slightly different context, see Handling Flow Copying.


The head-of-line blocking problem crops up when you need to transfer a really large file, like the 600 MB file you mentioned at the start of this thread. If you send the header and then start sending the body, your network connection is completely occupied until you’ve finished sending the body. If you need to send some other data in the meantime, it’ll have to wait until you’ve finished sending the entire body.

Fixing this can be tricky. One option is to use QUIC, where you can send the data in a separate stream. QUIC then takes care of multiplexing that on to the underlying transport, with separate flow control for each stream. Nice!

Of course, QUIC introduces its own complexities, most notably TLS (something we’re talking about right now over on this thread).
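
For illustration, setting up a QUIC connection group and opening a per-transfer stream might look roughly like this (iOS 15 and later; the endpoint and ALPN value are placeholders):

import Network

// Placeholders: substitute your own host, port, and ALPN identifier.
let endpoint = NWEndpoint.hostPort(host: "primary.local", port: 4433)
let quicOptions = NWProtocolQUIC.Options(alpn: ["prisec"])
let parameters = NWParameters(quic: quicOptions)

let group = NWConnectionGroup(with: NWMultiplexGroup(to: endpoint), using: parameters)
group.stateUpdateHandler = { state in
    if case .ready = state {
        // Each transfer gets its own stream with independent flow control,
        // so a huge file doesn't block other messages.
        if let stream = NWConnection(from: group) {
            stream.start(queue: .main)
            // ... send the header and chunks on `stream` as above ...
        }
    }
}
group.start(queue: .main)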

The other option is to go back to chunking, and allow higher priority messages to preempt your data chunks on the connection. That’s pretty much how HTTP/2 works.

Share and Enjoy

Quinn “The Eskimo!” @ Developer Technical Support @ Apple
let myEmail = "eskimo" + "1" + "@" + "apple.com"

This is great info, thank you so much! I'm learning a ton here on this thread. I will try this approach.
