Technical Q&A QA1761

Resumable Downloads

Q:  My app downloads large files over HTTP. How can I resume a partially completed download?

A: The answer depends on your target platform:

Resuming an HTTP download isn't too difficult, but you need to understand some critical HTTP concepts: entity tags, byte ranges, and conditional range requests (the Range and If-Range headers).

The basic strategy for resuming a download is:

  1. When you do the initial download, save the entity tag associated with the resource.

  2. As you save data to disk, remember how much of the data is valid.

  3. When you resume the download, retrieve the saved entity tag and the number of valid bytes on disk, and apply these values to the request via the If-Range and Range headers respectively.

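  4. Execute the request. If the resource is unchanged, the request succeeds with status 206 and you receive the remaining bytes of the resource. If the resource has changed, the server ignores the range and returns the entire resource with status 200, in which case you have to discard your partial data and start from scratch.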
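
The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the function names resume_headers and file_mode_for_status are hypothetical, the entity tag and byte count are assumed to have been saved by the earlier download, and the server is assumed to reply with 200 and the full resource when it does not honor the range.

```python
def resume_headers(entity_tag, bytes_saved):
    """Build the request headers for step 3 (hypothetical helper)."""
    return {
        # Ask only for the bytes that are not yet on disk.
        "Range": "bytes={}-".format(bytes_saved),
        # Make the range conditional on the resource being unchanged.
        "If-Range": entity_tag,
    }


def file_mode_for_status(status):
    """Decide how to open the local file for step 4 (hypothetical helper)."""
    if status == 206:
        return "ab"  # partial content: append to the data already saved
    if status == 200:
        return "wb"  # full resource: discard the partial data and start over
    raise ValueError("unexpected HTTP status: {}".format(status))


headers = resume_headers('"968f3f3e86e0339ce722170ae656bc73:1319461845"', 4041400)
print(headers["Range"])
print(headers["If-Range"])
```

Keeping the status check separate from the networking code makes the append-or-restart decision easy to test without a server.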

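If the server doesn't support entity tags, you can do something similar with the last-modified date: save the value of the Last-Modified header from the initial response and supply it in the If-Range header in place of the entity tag.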
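
One way to choose between the entity tag and the last-modified date is sketched below. The function name choose_validator is hypothetical, and its arguments are assumed to be the raw ETag and Last-Modified header values saved from the initial response (or None if the header was absent). The sketch also skips weak entity tags (those prefixed with W/), because HTTP does not allow them in If-Range.

```python
def choose_validator(etag, last_modified):
    """Return the value to send in If-Range, or None if no safe validator exists.

    Hypothetical helper: both arguments are header values saved from the
    initial download, or None when the server did not send that header.
    """
    # Prefer a strong entity tag; weak tags (W/"...") must not be used in If-Range.
    if etag and not etag.startswith("W/"):
        return etag
    # Fall back to the last-modified date.
    if last_modified:
        return last_modified
    # Nothing to validate against: resuming is not safe, so download from scratch.
    return None


print(choose_validator(None, "Mon, 24 Oct 2011 13:04:42 GMT"))
```

When the function returns None, the safest course is to send an ordinary request without Range and replace any partial data.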

For a concrete example of this, you can use a packet trace (see Technical Q&A QA1176, 'Getting a Packet Trace') to look at how Safari resumes a download on your Mac. Listing 1 shows a typical resume request.

Listing 1  An HTTP resume request

GET /[...]/MacOSXUpdCombo10.6.8.dmg HTTP/1.1
User-Agent: Safari/7534.52.7 [...]
Accept: */*
If-Range: "968f3f3e86e0339ce722170ae656bc73:1319461845"
Range: bytes=4041400-
Accept-Language: en-au
Accept-Encoding: gzip, deflate
Connection: keep-alive

The Range header tells the server you want to get the data starting at offset 4041400. The If-Range header tells the server you only want that data if the data hasn't changed since the server gave the client the supplied entity tag (that is, "968f3f3e86e0339ce722170ae656bc73:1319461845").

Listing 2 shows the corresponding response.

Listing 2  An HTTP resume response

HTTP/1.1 206 Partial Content
Server: Apache
Accept-Ranges: bytes
Content-Type: application/octet-stream
Last-Modified: Mon, 24 Oct 2011 13:04:42 GMT
ETag: "968f3f3e86e0339ce722170ae656bc73:1319461845"
Date: Mon, 23 Jan 2012 16:13:25 GMT
Content-Range: bytes 4041400-1087036999/1087037000
Content-Length: 1082995600
Connection: keep-alive

The HTTP status (206) tells you that the response only contains a subset of the resource. The Content-Range header tells you exactly what range of the resource the server is returning (byte 4041400 through to byte 1087036999) and the total length of the resource (1087037000). Finally, the Content-Length header tells you how many bytes the server is returning in this particular response.
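
To make the relationship between these headers concrete, the following Python sketch parses the Content-Range value from Listing 2 and checks it against the Content-Length. The function name parse_content_range is hypothetical, and the sketch only handles the single-range "bytes first-last/total" form shown above (with "*" accepted for an unknown total).

```python
import re


def parse_content_range(value):
    """Parse 'bytes first-last/total' into (first, last, total).

    Hypothetical helper; total is None when the server sends '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError("unsupported Content-Range: {!r}".format(value))
    first = int(match.group(1))
    last = int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


first, last, total = parse_content_range("bytes 4041400-1087036999/1087037000")

# Both ends of the range are inclusive, so the body length is last - first + 1.
body_length = last - first + 1
print(body_length)  # 1082995600, the Content-Length in Listing 2
```

Before appending the response body to the saved data, it is worth confirming that the first byte of the returned range equals the number of bytes already on disk.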

Document Revision History


New document that explains how to resume HTTP downloads.