ACIDRain: concurrency-related attacks on database backed web applications

ACIDRain: Concurrency-related attacks on database-backed web applications Warszawski & Bailis, SIGMOD’17

Welcome back to a new term of The Morning Paper. To kick things off, we have ‘ACID Rain’ – a terrific paper from SIGMOD’17 that pulls together a number of threads we’ve studied previously: transaction processing, anomalies, and security. What ACIDRain demonstrates is that the concurrency anomalies often ignored in practice as a nuisance / irritant, can easily turn into exploitable vulnerabilities. And that makes them not quite so easy to ignore.

In theory, database transactions protect application data from corruption and integrity violations. In practice, database transactions frequently execute under weak isolation that exposes programs to a range of concurrency anomalies, and programmers may fail to correctly employ transactions.

A great look at the state of concurrency control in practice is ‘Feral concurrency control: an empirical investigation of modern application integrity‘ (Bailis et al. 2015). For a more theoretical look at isolation levels and anomalies, see ‘A critique of ANSI SQL isolation levels‘ and ‘Generalized isolation level definitions.’ See also the short series of posts I wrote entitled ‘Out of the Fire Swamp‘. The short take away is that anomalies (i.e., concurrency bugs) definitely do happen in real systems.

In an ACIDRain attack, an attacker finds a concurrency anomaly that leads to violation of an important application integrity constraint (e.g., vouchers shouldn’t be spent more than once, the checkout price charged should include the price of all items in a shopping cart), and then deliberately drives concurrent requests designed to trigger and exploit the weakness.

… latent programming errors represent a potential security vulnerability, and the threat of systematic exploit is not theoretical: on March 2nd 2014 the Flexcoin Bitcoin exchange was subject to such a concurrency-related attack.

As a result of the attack, Flexcoin had to shut down.

The authors study 12 of the most popular self-hosted eCommerce applications, as deployed on over 2M websites, and find 22 critical ACIDRain attacks allowing attackers to corrupt store inventory, over-spend gift cards, and steal inventory.

Threat model

ACIDRain attacks trigger one of two types of anomalies that could not have arisen in a serial execution of the application:

  1. Level-based isolation anomalies are races due to database-level isolation settings. That is, the database may not support serializability, or may not be configured to do so (i.e., pretty much all deployed databases in the wild).
  2. Scoping isolation anomalies arise when an application programmer does not correctly encapsulate logic using transactions, so that concurrent requests can lead to behaviour that could not have arisen sequentially.

Of course, not all anomalies lead to exploitable vulnerabilities: their impact is application dependent.

We assume that an attacker can only access the web application via concurrent requests against publicly-accessible APIs (e.g., HTTP, REST). That is, to perform an ACIDRain attack, the attacker does not require access to the application server, database server, execution environment, or logs. Our proposed analysis techniques use full knowledge of the database schema
and SQL logs, but, once identified, an attacker can exploit the vulnerabilities we consider here using only programmatic APIs.2 This threat model applies to most Internet sites today.

Attack model

We define an ACIDRain attack on a database-backed web application as an exploit allowing an attacker to elicit undesirable application behavior by issuing concurrent requests to trigger non-serializable access to database-managed state.

Note that although a variety of concurrency-related bugs may exist in an application, here we’re just interest in finding those that impact database state. There are two straightforward conditions needed for an ACIDRain attack to exist against a given application:

  1. The application must exhibit anomalies (either level-based or scoping-based) under concurrent execution that could not have arisen under a serial execution.
  2. These anomalies must lead to violations of application invariants.

Detecting potential anomalies

To find possible attack vectors against a given application, the first challenge therefore is to discover what anomalies may occur during its execution. As anyone who has worked with database-backed web applications can probably attest:

Detecting these anomalies is challenging. Many potential anomalies are never triggered under normal operation due to limited concurrency, rendering simple observation ineffective.

Wishing to avoid language and framework-dependent static analysis techniques, the authors develop a very nifty technique for detecting potential level-based and scope-based anomalies in web apps by analysing logs of (regular) database activity. Unlike normal concurrency analysis that starts with a concrete history, and reasons about the anomalies that may occur in it, here we start with a concrete trace and generalise from there to find possible concurrent interleavings of the operations in the traces. This leads to an infinite set of possible traces, so Warszawski & Bailis introduce the notion of an abstract history representing in finite space all expansions of a given trace.

…we developed a new, cross-platform methodology for detecting potential level-based and scope-based anomalies in web applications by analyzing logs of typical database activity. We call this approach Abstract Anomaly Detection (2AD). Figure 2 (below) shows an overview of the workflow.

This strikes me as an incredibly useful analysis tool even if you’re not interested in crafting (or defending against) ACIDRain attacks. The details are given in section 3 of the paper, and in appendix A. I’ll give a high-level overview here.

The process starts out by driving regular web application behaviour and capturing the database logs that result. From the database logs, the sequence of transactions generated by each API call can be captured. Given a concrete trace, it is generalised to a abstract history that represents the set of all possible expansions of the trace. Abstract histories contain operation nodes, transaction nodes, and API call nodes. Edges represent reads and writes. Here’s an example:

We can simultaneously find both level-based and scope-based non-serializable expansions of the original trace (i.e., witnesses) using the abstract history. Just as cycles among transactions indicate anomalies in traditional formalism for reasoning about concrete histories, cycles of API nodes in the abstract history correspond to potentially anomalous behavior in API calls.

(This notion of asking “of all the possible executions, what are the ways this could go wrong” also puts me in mind of Molly – Lineage-Driven Fault Injection).

Remarkably, non-trivial cycle detection in this lifted form captures any and all possible anomalous expansions of a given concrete history. That is, this theorem states that this method is complete with respect to the trace – if a non-serializable expansion exists, 2AD will find it.

The 2AD process proceeds by selecting pairs of operations in the same API call and looking for non-trivial cycles between them. In the worst case this O(p4) for p operations, but the worst case is rarely reached and the complexity is closer to O(p) in practice. By focusing on specific tables and columns of interest, analysis completed in under 10 seconds for every application analysed.

The witnesses produced at this point assume an application executing without any isolation at all, and thus will probably include false positives. For a chosen isolation level, the constraints on witness histories introduced by that isolation model can be used to refine the set of witnesses produced.

We can in fact capture the entire theory of weak isolation, including common models such as Snapshot Isolation via witness refinement by encoding the corresponding restrictions on witness histories.

For false-positives relating to scope-based anomalies it is in theory possible to model application behaviour, but in practice the authors found it easier simply to attempt to trigger a reported anomaly and then find the program logic preventing it if they couldn’t.

ACIDRain attacks on eCommerce applications

The authors apply a prototype of 2AD to a suite of 12 eCommerce applications. Over 60% of the top 1M eCommerce sites are backed by these platforms, and over 55% of eCommerce sites in total.

We did not censor our selection, but instead selected for prevalence and popularity alone. We report results from every application we tested.

To make things tractable, the team focused on three sets of invariants common across almost all applications in the corpus:

  • the inventory invariant says that a product’s stock must be non-negative, and an item’s final stock should reflect orders placed.
  • the voucher invariant says that vouchers should not be used to spend more than their specified limit (overspending).
  • the cart invariant says that the total amount charged for an order should reflect the total value of goods in the order. “While this invariant may seem obvious, we found that in several of the applications it was possible to add an item to the cart concurrent with checkout, resulting in the user paying for the original total of items in the cart, but placing a valid order including the new item as well. This allows users to obtain items for free. For example, a user might buy a pen and add a laptop to their cart during checkout, paying for the pen but placing an order for the pen and laptop.”

After filtering on tables and columns of interest, 2AD returned a median of 37 witnesses per application. Triggering and verifying each attack (e.g., crafting concurrent HTTP requests) took approximately two hours.

Across the 12 applications, 22 vulnerabilities were found: nine inventory vulnerabilities, eight voucher vulnerabilities, and five cart vulnerabilities. Five of these were level-based anomalies, meaning that the default weak isolation level led to them, the remaining 17 were scope-based. Some applications failed to use transactions entirely, others used them ‘sparingly.’

…the prevalence of scope-based vulnerabilities even in the presence of transactions indicates that either programmers, ORMs, or both find it difficult to properly use transactions to encapsulate critical operations.

For the level-based vulnerabilities, the following table shows how many would exist using a variety of popular databases in their default and in their maximum isolation configurations:

Vulnerabilities were reporting to the application developers, yielding a variety of mixed responses showing differing levels of understanding (see section 4.7).

We believe that the magnitude and the prevalence of these vulnerabilities to ACIDRain attacks merits a broader reconsideration of the success of the transaction concept as employed by programmers today, in addition to further pursuit of research in this direction. Based on our early experiences both performing ACIDRain attacks on self-hosted applications as well as engaging with developers, we believe there is considerable work to be done in raising awareness of these attacks—for example, via improved analyses and additional 2AD refinement rules (including analysis of source code to better highlight sources of error)—and in automated methods for defending against these attacks—for example, by synthesizing repairs such as automated isolation level tuning and selective application of SELECT FOR UPDATE mechanisms. Our results here—as well as existing instances of ACIDRain attacks in the wild—suggest there is considerable value at stake.