Granularity of Locks and Degree of Consistency in a Shared Data Base – Gray et al. 1975
This is part 3 of a 7 part series on (database) ‘Techniques Everyone Should Know.’
Today we’ll look at the second part of this paper which introduces the notion of differing degrees of consistency, and how we can reason about them. Gray et al. introduce four degrees of consistency, called simply degrees 0 through 3. Degree 1 we today call Read Uncommited, and Degree 2 corresponds to Read Committed. Degree 3 gives repeatable reads (and later claimed serializability, though there is no discussion of phantoms)… the relationship to the ANSI SQL Isolation levels is complex and discussed in A Critique of ANSI SQL Isolation Levels, Berenson et al. 1995.
Before diving into the levels, it’s worth taking a few moments to examine what ‘consistency’ really means…
The data base consists of entities which are known to be structured in certain ways. This structure is best thought of as assertions about the data… The data base is said to be consistent if it satisfies all its assertions. In some cases, the data base must become temporarily inconsistent in order to transform it to a new consistent state. For example, adding a new employee involves several atomic actions and updating of several fields. To cope with these temporary inconsistencies, sequences of atomic actions are grouped to form transactions. Transactions are the units of consistency. They are larger atomic actions on the data base which transform it from one consistent state to a new consistent state.
Transactions are also the unit of recovery, since if any one action of a transaction fails then the entire transaction is undone so as to return the database to a consistent state. Given atomic transactions, the potential for inconsistencies arises only when we try to get higher throughput by running mutiple transactions concurrently. If we only ran one transaction at a time, then each transaction would see the consistent state left behind by its predecessor. But if we schedule several transactions concurrently, then we need some locking mechanism to ensure that the inputs to each transaction are consistent.
Think of a transaction as a sequence of actions. Any sequence preserving merging of the actions of a set of transactions into a single sequence is called a schedule.
A schedule is a history of the order in which actions are executed (it does not record actions which are undone due to backup). The simplest schedules run all actions of one transaction and then all actions of another transaction… Such one-transaction-at-a-time schedules are called serial because they have no concurrency among transaction. Clearly, a serial schedule has no concurrency induced consistency and no transaction sees dirty data. Locking constrains the set of allowed schedules. In particular, a schedule is legal only if it does not schedule a lock action on an entity for one transaction when that entity is already locked by some other transaction in a conflicting mode.
An output (a write) of a transaction is committed when the transaction abdicates the right to ‘undo’ the write, and makes the new value available to all other transactions. Writes not yet committed by the writer are said to be uncommitted or dirty.
Degree 0
A degree 0 transaction simply promises not to overwrite the dirty data of other transactions. In terms of locking protocols, degree 0 requires obtaining a (possibly short) exclusive lock on any data the transaction dirties. Degree 0 consistent transactions are unrecoverable since they commit outputs before the end of the transaction. However, if all transactions see at least degree 0 consistency, then any transaction at a higher level is recoverable.
A degree 0 consistent transaction is well-formed with respect to its writes (it always obtains an exclusive lock before writing).
Degree 1
Degree 1 consistency combines the promise of degree 0 not to overwrite the dirty data of other transactions with an agreement not to commit any writes before the end of the transaction. This requires setting a long (held until the end of the transaction) exclusive lock on any data it dirties.
A degree 1 consistent transaction is well-formed and two-phase (releases all locks only after all required locks have been obtained) with respect to writes.
Degree 2
With degree 1 consistency a transaction may read uncommitted values which are subsequently updated or undone. Degree 2 consistency isolates a transaction from such uncommitted values. In addition to the requirements for degree 1 consistency, it does not read the dirty data of other transactions. To achieve this it sets a (possibly short) share lock on any data it reads.
Degree 2 consistent transactions are well-formed (i.e., with respect to both reads and writes), and two-phase with respect to writes.
Degree 3
A degree 2 consistent transaction may read two different (both committed) values if it reads the same entity twice. Degree 3 consistency completely isolates the transaction from inconsistencies due to concurrency. A degree 3 consistent transaction takes long exclusive locks on any data it dirties, and long share locks on any data it reads. It is therefore well-formed and two-phase.
Consistency and Schedules
Recall that a schedule is a history of the order in which the actions of transactions are executed.
A transaction T runs at degree 0 (1, 2, or 3) consistency in schedule S if T sees degree 0 (1, 2, or 3) consistency in S. If all transactions run at degree 0 (1, 2, or 3) consistency in schedule S then S is said to be a degree 0 (1, 2, or 3) consistent schedule.
If each transaction observes the degree 0 (1,2, or 3) lock protocol, then any legal schedule is degree 0 (1, 2, or 3) consistent. If some transaction T does not observe the degree 1 (2 or 3) lock protocol then it is still possible to define another transaction T’ observing the protocol such that T and T’ have a legal schedule, but T does not run at degree 1 (2 or 3) consistency in S.
Unless a transaction actually sets the locks prescribed by degree 1 (2 or 3) consistency one can construct transaction mixes and schedules which will cause the transaction to run at (see) a lower degree of consistency. However, in particular cases such transaction mixes may never occur due to the structure or use of the system. In these cases an apparently low degree of locking may actually provide degree 3 consistency….
(This is the same insight that Coordination Avoidance in Database Systems uses to achieve higher concurrency levels).
Implications for Transaction Backup and System Recovery
System-wide degree 1 consistency allows transaction backup and system recovery without lost updates.
The transaction is unrecoverable after its first commit of an update (unlock) and so although degree 1 does not require it, the shrinking phase is usually deferred to the end of transaction so that the transaction is recoverable.
Degree 1 transactions may read uncommitted (dirty) data, and transaction and system recovery may undo uncommitted updates. Therefore they may produce different results when re-run. Degree 2 transactions re-run in the order specified in the log will always result in the same consistent state.