Protection in programming languages Morris Jr., CACM 1973
This is paper 5/7 on Liskov’s list.
Experienced programmers will attest that hostility is not a necessary precondition for catastrophic interference between programs.
So what can we do to ensure that program modules are appropriately protected and isolated? We still need to allow modules to cooperate and communicate, but ideally we’d like to be able to prove that a given module has various properties without considering the rest of the system. Morris explores a set of mechanisms that provide different kinds of guarantees:
- Procedures as objects (or as we would say today, “functions as first class citizens”).
- Local objects (scoping)
- Memory protection
- Type-based protections
- Seals and Trademarks
- Access keys
Procedures as objects
A procedure already hides details of its implementation from a caller, so why is it necessary that we can also pass procedures as arguments?
…in order to exploit this device to its fullest extent it is useful to make procedures full-fledged objects in the sense that they can be passed as parameters or values of other procedures and assigned to variables. Then, to make a particular object accessible to someone in a restricted way (e.g. read only) we make the object local to a procedure and pass the procedure to them in its stead.
I buy the argument when the procedure offers a limited view on an encapsulated object, but in general if the procedure is just wrapping an object it must give out either a value-object or a reference-object. If it gives a reference object, then we have no more protection than if we passed this directly. If it gives a value-object, then the procedure seems to offer no value over just passing a value-object directly. See ‘Why functional programming matters‘ for some more powerful (imho!) arguments on the value of functions as first-class citizens.
Local objects
It’s desirable to know that an object is local to a particular part of a program – that is, it cannot be accessed by other parts. “Object” here is used to mean for example the address of a variable, not an object as in the OOP sense. To validate such locality we need to check:
- that it is not passed as a parameter to a procedure not defined within the module
- that it is not returned as a result of a procedure called from outside the module
- that it is not assigned to a reference which is not local to the module
Pony and its deny capabilities is a modern theme along these lines.
Memory protection
We’d like to encapsulate a memory reference (e.g. use of a variable) within a module such that only statements inside the module can assign to it or take its contents. Which is very similar to the ‘local objects’ requirement, but I guess implied is the added protection that the memory itself is inaccessible from outside of the module as well.
Given such protection, we can reason about the state of the module using pre-conditions and post-conditions on all of the operations exposed by it.
Types
Now things start to get a bit more interesting…
One of the fundamental ways in which one programmer provides services to another is to implement a representation of a new kind of object…Generally speaking the first programmer writes some subroutines which create, alter, and otherwise deal with the new kinds of objects. These objects are represented by some older kinds.
In 1973 the representations were typically just primitive types with some conventions, for example an “interval” may simply be represented as a pair of integers. Of course this still happens today, especially when programming in a functional style. We hope the recipient of such representations treats them as the module designer intended, but at the same time we open ourselves up to three kinds of ‘mishap’:
- Alteration – the state could be changed without the use of the primitive functions provided for the purpose
- Discovery – the properties of the data might be explored without using the primitive functions
- Impersonation data created outside of the module might be presented to a primitive function expecting it to have certain properties
There are two parties involved in such mishaps: the programmer of the primitive procedures and the user (misuser, actually) of them.
Through the lens of this paper, discovery presents no direct threat, but alteration and impersonation can both expose bugs by invalidating assumed pre-conditions.
Morris’ solution is to allow data to be tagged with a type-tag, “familiar to any implementor of a language with dynamic data types… but being made available to the general user.” Passing data not tagged with the appropriate type-tag should result in a program violation.
These tags have a couple of interesting properties vs static typing:
First, they are ‘cascadable’ : a tagged item may be retained. This feature reflects our belief that representation can be a multilevel affair. Second, they are completely dynamic in the sense that a running program could generate an arbitrarily large number of different type tags. The practical virtues of this generality are not entirely clear (!).
Nowadays we’d probably use domain types.
Seals and Trademarks
A seal provides a way of wrapping an object such that it cannot be seen or tampered with until it is unsealed. A sealed object may safely be passed as a parameter, but its contents cannot be inspected by anyone not in possession of the unseal key. (Morris just uses an integer, ‘which changes every time create_seal is called.’) The seal mechanism provides for authentication (you know its an object you created, if you can unseal it), and access control. Local objects that are sealed remain local even if the seal escapes the module.
Instead of an integer, today I immediately think of digests (it hasn’t been changed), signatures (I created it), and encryption keys (or key-pairs) (it can’t be seen). Though these are more heavyweight constructs than Morris intends I believe.
Trademarks are a more general form of seals that wrap objects in something that testifies to their authenticity (again, implemented using ever-changing serial-numbers in Morris’ world). This is an early nod to the value of signatures:
If the reader doubts the utility of this device, stripped of any access limiting power, we urge him or her to consider how trademarks and other authenticating marks are used in the commercial/bureaucratic sphere. A mark from the Underwriters’ Laboratories attests that a particular device may be used without risk, a fact that a user might discover only by extensive experience. A letter saying “You have been drafted,” is much more likely to be believed if it bears the mark of a draft board….
Morris’ seals and trademarks are lighter-weight constructs than the cryptographic operations I’ve associated them with though. In usage, he seems to envision them as something very similar to type-tags:
We can invent trademarks to attest to these various properties and attach them to objects in any order we like. For example, a complex number represented in the cartesian system might bear four marks: pair, point-in-plane, cartesian, complex. Programs may examine these marks or not as they please: e.g. A routine that prints pairs is unconcerned with all but the pair tag.
Access keys
Now let us example the implications of making the seal operation public and leaving unseal private…. if one puts such a seal on an object and leaves it in a public place, he/she is guaranteed that only the restricted group of programs holding unseal can access it…
Sounds very much like a public/private key pair!