End-to-End Arguments in System Design

End-to-end arguments in system design – Saltzer, Reed, & Clark 1984.

A true classic from 30 years ago. From the abstract:

This paper presents a design principle that helps guide placement of functions among the modules of a distributed computer system. The principle, called the end-to-end argument, suggests that functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level.

The end-to-end argument says that many functions in a communication system can only be completely and correctly implemented with the help of the application(s) at the endpoints.

Therefore, providing that questioned function as a feature of the communication system itself is not possible (sometimes an incomplete version may be useful as a performance enhancement). … We call this line of reasoning against low-level function implementation the end-to-end argument.

The paper makes the case that (application-level) encryption, duplicate message detection, message sequencing, guaranteed message delivery, detection of crashes, and delivery receipts are all amenable to the argument.
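Duplicate message detection is a handy illustration: if the application itself retries a request, the network sees two distinct messages, so only the endpoint can recognise the duplicate. Here is a minimal sketch of that idea, assuming the sender attaches its own request identifier. The `DedupingReceiver` class, `request_id`, and the payload strings are hypothetical names for illustration and do not come from the paper.

```python
class DedupingReceiver:
    """Application-level endpoint that applies each request at most once."""

    def __init__(self) -> None:
        self._seen: set[str] = set()   # request ids already applied
        self.applied: list[str] = []   # payloads that took effect

    def handle(self, request_id: str, payload: str) -> bool:
        """Apply the payload unless this request id was seen before.

        Returns True if the request was applied, False if it was a duplicate.
        """
        if request_id in self._seen:
            return False
        self._seen.add(request_id)
        self.applied.append(payload)
        return True


receiver = DedupingReceiver()
receiver.handle("req-1", "debit 10")
# An application-level retry: to the network this is a brand new message.
receiver.handle("req-1", "debit 10")
```

A real system would also need to bound or persist the set of seen identifiers, which this sketch ignores.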

An example is given of a file transfer program, which, because failures can occur at any step along the way, must either:

(a) “reinforce each of the steps along the way using duplicate copies, time-out and retry, carefully located redundancy for error detection, crash recovery etc. in order to reduce the probability of each of the individual threats to an acceptably small value,” or

(b) Store a checksum with the file. Transfer the file in a straightforward manner without any complications, then read it in from storage at the destination, recalculate the checksum, and send it back to the originator for comparison.

Option (b) is much simpler!
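Option (b) can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: SHA-256 stands in for "a checksum", a local `shutil.copyfile` stands in for the network transfer, the checksum is computed from the source rather than stored alongside it, and the retry-on-mismatch loop and the name `transfer_with_end_to_end_check` are my own additions.

```python
import hashlib
import shutil


def sha256_of(path: str) -> str:
    """Hash a file by reading it back from storage in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_end_to_end_check(src, dst, transfer=shutil.copyfile,
                                   max_attempts=3) -> int:
    """Transfer src to dst and verify the result end to end.

    `transfer` is any callable(src, dst); it is allowed to be unreliable.
    Returns the number of attempts needed, or raises IOError if the
    checksums never match within max_attempts.
    """
    expected = sha256_of(src)  # checksum known at the originating end
    for attempt in range(1, max_attempts + 1):
        transfer(src, dst)              # plain transfer, no per-step hardening
        if sha256_of(dst) == expected:  # re-read from destination storage
            return attempt
    raise IOError(f"checksum mismatch after {max_attempts} attempts")
```

The check covers every step between the two disks at once, which is why the individual steps do not each need to be made perfectly reliable.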

However, it would be simplistic to conclude that the lower levels should play no part in obtaining reliability…. some effort at the lower levels can have a significant impact on application performance. But the key idea here is that the lower levels need not provide ‘perfect’ reliability.

End-to-end arguments can help with layered protocol design and “may be viewed as part of a set of rational principles for organizing such layered systems.”

A follow-on paper roughly 20 years later, “A critical review of ‘End-to-end arguments in system design’” (Moors, 2002), looks at why many of the cited functions did migrate into the communications substrate over time.

For example,

…congestion control is not amenable to end-to-end implementation for the following reasons: First, like routing, congestion is a phenomenon of the network, and since multiple endpoints share the network, it is the network that is responsible for isolating endpoints that offer excessive traffic so that they do not interfere with the ability of the network to provide its service to other endpoints. Second, it is naive in today’s commercial Internet to expect endpoints to behave altruistically, sacrificing the performance that they receive from the network in order to help the network limit congestion.

To determine if the end-to-end arguments are applicable to a certain service,

…it is important to consider what entity is responsible for ensuring that service, and the extent to which that entity can trust other entities to maintain that service.