HTTP Data Length Issues, "Chunked" Transfers and Message Trailers
Two different levels of encodings are used in HTTP: content encodings, which are applied to HTTP entities, and transfer encodings, which are applied to entire HTTP messages. Content encodings are used for convenience to package entities for transmission, whereas transfer encodings are hop-specific and are intended for situations where data needs to be made “safe” for transfer.

However, we’ve already seen in the previous topic that HTTP can transport arbitrary binary data, so unlike the situation where MIME had to make binary data “safe” for RFC 822, this is not an issue. So why are transfer encodings needed at all? In theory they are not, and HTTP/1.0 did not even have a Transfer-Encoding header (though it did use content encodings). The concept of transfer encoding became important in HTTP/1.1 due to another key feature of that version of HTTP: persistent connections.

Recall that HTTP uses the Transmission Control Protocol (TCP) for connections. One of the key characteristics of TCP is that it transmits all data as a stream of unstructured bytes. TCP itself does not provide any way of differentiating between the end of one piece of data and the start of the next; this is left up to each application.

In HTTP/1.0 (and HTTP/0.9) this was not a problem, because those versions used only transitory connections. Each HTTP session consisted of only one request and one response; since client and server each sent only one message, there was no need to worry about differentiating HTTP messages on a connection. HTTP/1.1’s persistent connections improve performance by letting devices send requests and responses one after the other over a single TCP connection. However, the fact that messages are sent in sequence makes differentiating them a concern.

Using The Content-Length Header

There are two usual approaches to dealing with this sort of data length issue: either using an explicit delimiter to mark the end of the message, or including a length header or field to tell the recipient how long each message is. The first approach could not easily have been adopted while maintaining compatibility with older versions of the protocol. This left the second approach; since HTTP already had a Content-Length entity header, the solution was to use this to indicate the length of each message at transmission time.
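To make the length-header approach concrete, here is a minimal sketch of how a recipient might separate messages that arrive back to back on one persistent connection. The function name split_messages and the sample data are hypothetical, and the sketch assumes the whole byte stream is already in memory and that a message without a Content-Length header has an empty body; a real HTTP implementation handles many more cases.

```python
def split_messages(stream: bytes) -> list[tuple[bytes, bytes]]:
    """Split a buffer of back-to-back HTTP messages into (head, body) pairs."""
    messages = []
    pos = 0
    while pos < len(stream):
        # The header section ends at the first blank line (CRLF CRLF).
        head_end = stream.find(b"\r\n\r\n", pos)
        if head_end == -1:
            raise ValueError("incomplete header section")
        head = stream[pos:head_end]
        body_start = head_end + 4

        # Find Content-Length; header names are case-insensitive.
        # Simplifying assumption: no Content-Length means an empty body.
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the start line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())

        body_end = body_start + length
        if body_end > len(stream):
            raise ValueError("incomplete body")
        messages.append((head, stream[body_start:body_end]))
        pos = body_end  # the next message starts right after this body
    return messages


# Two responses as TCP would deliver them: one unbroken run of bytes.
stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 3\r\n\r\nbye"
)
bodies = [body for _, body in split_messages(stream)]
```

Nothing in the byte stream itself marks where the first body ends; the recipient knows only because the header told it to read exactly five bytes before looking for the next message.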

This method works fine in cases where the size of the entity to be transferred is known in advance, as when a static file (a text document, image or executable program) needs to be transmitted. However, there are many types of resources that are generated dynamically; the total size of such a resource is not known until it has been completely processed. While not typical in HTTP’s early days, these account for a large percentage of World Wide Web traffic today.

Many Web pages are not static HTML files, but are instead created as output from scripts or programs based on user input; discussion forums are a good example. Even HTML files today are often not static: they frequently contain program elements such as server-side includes (SSIs) that cause code to be generated on the fly, so their exact size cannot be determined in advance.
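The cost of relying on Content-Length for dynamic content can be sketched as follows. The names render_page and build_response are hypothetical, and the page generator is only a stand-in for a real script or forum engine. The point is that the sender must collect every fragment before it can write the header, so nothing goes out on the connection until generation has finished.

```python
def render_page(posts):
    """Hypothetical dynamic page: yields HTML fragments one at a time."""
    yield b"<html><body>\n"
    for post in posts:
        yield b"<p>" + post.encode("utf-8") + b"</p>\n"
    yield b"</body></html>\n"


def build_response(fragments):
    """Build a complete response that carries a Content-Length header."""
    # The length is unknown until the last fragment has been produced,
    # so the whole body is buffered in memory first.
    body = b"".join(fragments)
    head = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n"
    )
    return head + body


response = build_response(render_page(["first post", "second post"]))
```

For a short page the buffering is harmless, but for a large or slowly generated resource it delays the first byte of the response and ties up memory on the server.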

