Concurrency Control Performance Modeling: Alternatives and Implications – Agrawal et al. 1987
This is part 4 of a 7-part series on (database) ‘Techniques Everyone Should Know.’
Here’s something you can probably relate to: lots of published performance studies, each showing significant advantages for its preferred system or approach, and yet contradicting each other. What’s going on, and how can we get to the truth? This was the situation back in 1987 in the world of concurrency control algorithms for database systems: research was blossoming, leading to many studies evaluating the performance of the various algorithms…
These performance studies are informative, but the results that have emerged, instead of being definitive, have been very contradictory. For example, studies by Carey and Stonebraker [15] and Agrawal and DeWitt [2] suggest that an algorithm that uses blocking instead of restarts is preferable from a performance viewpoint, but studies by Tay [50, 51] and Balter et al. [6] suggest that restarts lead to better performance than blocking. Optimistic methods outperformed locking in [20], whereas the opposite results were reported in [2 and 15]. In this paper, rather than presenting “yet another algorithm performance study,” we examine the reasons for these apparent contradictions, addressing the models used in past studies and their implications.
For demonstrating the correctness of a concurrency control algorithm we have a gold standard: serializability. But there is no such standard for demonstrating the performance characteristics of an algorithm. “The result is that nearly every study has been based on its own unique set of assumptions regarding database system resources, transaction behaviour, and other such issues.”
The big lesson from this paper is that those assumptions matter – we need to look closely at what is assumed and test those assumptions to see what lies beneath them. Agrawal et al. build a model that captures the main elements of a database environment including both the users (clients) and physical resources for storing and processing data. “On the basis of this framework, we then show that differences in assumptions explain the apparently contradictory performance results from previous studies.”
The assumptions that previous studies have made and which the authors probe are as follows:
- There are infinite resources
- Restarted transactions are replaced in the model by new, independent transactions, rather than running the same transaction over again
- Write-locks are set on items to be updated when they are first read (as opposed to first acquiring a read lock and then only later upgrading it if needed)
The impact of these assumptions is investigated for three concurrency control strategies: blocking, immediate-restart, and optimistic.
- Blocking transactions set read locks on objects that they read, and these locks are later upgraded to write locks for objects that they also write. If a lock request is denied, the requesting transaction is blocked. Deadlock detection will be performed and if there is a deadlock, the youngest transaction in the cycle is aborted. Dynamic two-phase locking is an example of this strategy.
- Immediate-Restart transactions take locks as described for blocking transactions, but if a lock request is denied the requesting transaction is aborted and restarted after a restart delay.
- Optimistic transactions execute unhindered and are validated only after they have reached their commit points. A transaction is restarted at its commit point if it finds that any object that it has read has been written by another transaction that committed during its lifetime.
These strategies represent different extremes in terms of when conflicts are detected, and how they are resolved.
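To make the differences concrete, here is a minimal Python sketch of the three conflict-handling policies. The function and type names are mine, not the paper’s, and the real algorithms of course also manage lock tables, wait queues, and deadlock detection:

```python
from enum import Enum, auto

class Outcome(Enum):
    PROCEED = auto()   # carry on executing
    BLOCKED = auto()   # wait for the lock holder
    RESTART = auto()   # abort and run the transaction again

def on_lock_denied(strategy: str) -> Outcome:
    """What happens when a lock request is denied under each strategy."""
    if strategy == "blocking":
        # Dynamic 2PL: wait; a deadlock detector later aborts the
        # youngest transaction in any cycle it finds.
        return Outcome.BLOCKED
    if strategy == "immediate-restart":
        # Abort now, retry after a restart delay.
        return Outcome.RESTART
    raise ValueError(f"unknown strategy: {strategy}")

def validate_at_commit(read_set: set[int], overlapping_commits: set[int]) -> Outcome:
    """Optimistic: execute unhindered, then restart at the commit point
    if any object read was written by a transaction that committed
    during this transaction's lifetime."""
    return Outcome.RESTART if read_set & overlapping_commits else Outcome.PROCEED
```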
The performance model built by the authors to explore the impact of the assumptions on the strategies has three parts:
- a database system model capturing the relevant characteristics of the system’s hardware and software, including physical resources such as CPUs and disks;
- a user model that captures the arrival process for users; and
- a transaction model that captures the reference behaviour and processing requirements of the transactions in the workload.
Central to our simulation model for studying concurrency control algorithm performance is the closed queueing model of a single-site database system… there are a fixed number of terminals from which transactions originate. There is a limit to the number of transactions allowed to be active at any time in the system, the multiprogramming level mpl…
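The paper studies this closed model with a detailed simulator. As a rough illustration of how a closed queueing model ties terminals, think time, and resource demands to throughput, here is a standard Mean Value Analysis sketch; it captures resource contention only (not data contention), and all parameter values below are made up rather than taken from the paper:

```python
def mva_throughput(num_terminals: int, think_time: float, demands: list[float]) -> float:
    """Exact Mean Value Analysis for a closed network: a fixed population of
    terminals (with the given think time) cycling through queueing centres
    (e.g. CPU, disk) with the given per-transaction service demands."""
    queue = [0.0] * len(demands)                 # mean queue length at each centre
    throughput = 0.0
    for n in range(1, num_terminals + 1):        # add one 'customer' at a time
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(resid))
        queue = [throughput * r for r in resid]
    return throughput

# e.g. 200 terminals, 1s think time, CPU and disk demands of 15ms and 35ms
print(mva_throughput(200, 1.0, [0.015, 0.035]))
```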
Some of you may be too young to remember the transaction processing world described in this paper. If the string “3270” (it’s pronounced thirty-two seventy) doesn’t mean anything to you, that’s you! This was a time before end users could directly drive transactions over the internet. Instead, Terminals (I’ll use a capital T to distinguish them from the ‘terminal windows’ you open on your laptop) provided a screen and keyboard connected to a server. These were used by employees of a business to enter transactions, with a typical interaction being ‘screen’ based rather than character based. Agrawal et al. examine systems with as many as 200 connected Terminals! ;). (The 1977 IBM 3033 mainframe could support up to 17,500 such terminals using only 16MB of memory under CICS). By 1987 employees may well have been using a terminal emulator running on a PC instead of a dedicated physical Terminal, but the fundamental model remained.
The infinite resource assumption
Under infinite resources, and in the absence of contention, throughput should simply scale with the number of concurrent transactions allowed (the multiprogramming level). But of course the probability of conflicts increases with the amount of concurrency.
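As a back-of-envelope illustration of that second point (my own approximation, not a formula from the paper): if each transaction accesses a fixed number of items chosen uniformly at random from the database, the chance of touching an item already in use by one of the other concurrent transactions grows quickly with the multiprogramming level:

```python
def conflict_probability(mpl: int, txn_size: int, db_size: int) -> float:
    """Rough approximation: probability that a transaction of txn_size items
    touches at least one item held by any of the other (mpl - 1) concurrent
    transactions, assuming uniform random access over db_size items."""
    items_held_by_others = txn_size * (mpl - 1)
    p_item_free = 1 - items_held_by_others / db_size
    return 1 - p_item_free ** txn_size

for mpl in (5, 25, 100, 200):
    print(mpl, round(conflict_probability(mpl, txn_size=8, db_size=10_000), 3))
```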
When given infinite resources, blocking starts thrashing as the concurrency level increases beyond a certain point whereas throughput continues to increase with the optimistic algorithm. The immediate-restart algorithm reaches a plateau. Blocking gave the lowest response time variance though.
The studies that show the optimistic algorithm is best were right! So were the studies that show that blocking may start thrashing before restarts do.
But what if we challenge the assumption and seriously limit resources? Under these conditions the maximum throughput was obtained with the blocking algorithm, and immediate-restart performed as well as or better than the optimistic algorithm.
The studies that show the blocking algorithm is best were right!
By varying the number of resources available the authors were able to demonstrate the cross-over point at which the optimistic algorithm becomes preferable to the blocking one.
Reflecting on the results of the experiments reported in this section, several conclusions are clear. First, a blocking algorithm like dynamic two-phase locking is a better choice than a restart-oriented concurrency control algorithm like the immediate-restart or optimistic algorithms for systems with medium to high levels of resource utilization. On the other hand, if utilizations are sufficiently low, a restart-oriented algorithm becomes a better choice… Second, the past performance studies were not really contradictory after all: they simply obtained different results because of the very different resource modeling assumptions. We obtained results similar to each of the various studies by varying the level of resources that we employed in our database model. Clearly then, a physically justifiable resource model is a critical component for a reasonable concurrency control performance model.
The fake restart assumption
An assumption that has been used for modeling convenience in a number of studies is the fake restart assumption, in which a restarted transaction is assumed to be replaced by a new transaction that is independent of the restarted one…
Upon investigation, the authors found that when resources are plentiful the fake restart assumption produces significantly higher throughputs for the immediate-restart and optimistic algorithms, making them look artificially better. With more limited resources, the fake restart assumption also favours the restart-oriented algorithms at low concurrency levels, whereas at higher concurrency levels the algorithms benefit almost equally.
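A tiny sketch of the difference (names and sizes below are illustrative, not the paper’s): under the fake restart assumption an aborted transaction is replaced by a freshly drawn, independent transaction, whereas a real restart re-runs the same transaction and therefore contends for the same items again:

```python
import random

DB_SIZE, TXN_SIZE = 1_000, 8

def new_transaction() -> frozenset[int]:
    """A transaction is modelled here as just the set of items it accesses."""
    return frozenset(random.sample(range(DB_SIZE), TXN_SIZE))

def restart(aborted: frozenset[int], fake_restart: bool) -> frozenset[int]:
    # Fake restart: swap in an independent new transaction (a modelling convenience).
    # Real restart: run the very same transaction again.
    return new_transaction() if fake_restart else aborted
```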
The immediate write-lock acquisition assumption
What if you don’t allow lock upgrades, so that a write lock (rather than a read lock) must be obtained the first time a transaction reads an item that it will eventually update?
At low concurrency levels this makes little difference, but at higher concurrency levels it does: the throughput predictions for blocking are significantly lower under the no-upgrades assumption.
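Roughly, the two policies differ only in which lock is requested on first read of an item the transaction will later update. Here is a sketch with hypothetical names; note it assumes the transaction’s update set is known up front, which is exactly what the no-upgrade assumption requires:

```python
def lock_mode_on_first_read(item: int, update_set: set[int], upgrades_allowed: bool) -> str:
    """With upgrades: always take a read (shared) lock first, and upgrade it
    to a write lock later if the item turns out to be updated.
    Without upgrades: take the write (exclusive) lock immediately for any
    item the transaction will eventually update."""
    if upgrades_allowed or item not in update_set:
        return "read"
    return "write"
```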
So what’s the best concurrency control algorithm?
The answer of course is that ‘it depends!’
A more general result of this study is that we have reconfirmed results from a number of other studies, including studies reported in [1, 2, 6, 12, 15, 20, 50, and 51]. We have shown that seemingly contradictory performance results, some of which favored blocking algorithms and others of which favored restarts, are not contradictory at all. The studies are all correct within the limits of their assumptions, particularly their assumptions about system resources. Thus, although it is possible to study the effects of data contention and resource contention separately in some models, and although such a separation may be useful in iterative approximation methods for solving concurrency control performance models, it is clear that one cannot select a concurrency control algorithm for a real system on the basis of such a separation – the proper algorithm choice is strongly resource dependent. A reasonable model of database system resources is a crucial ingredient for studies in which algorithm selection is the goal.
Or as Peter Bailis puts it in his introduction to chapter 3 of the Redbook:
One property of concurrency control remains a near certainty: there is no unilateral “best” mechanism in concurrency control. The optimal strategy is workload-dependent.
If you like this paper, you may also be interested in “Musketeer: all for one, one for all in data processing systems.”