Making smart contracts smarter

Making smart contracts smarter Luu et al., CCS 2016

This is the fourth in a series of papers from the ACM Queue Research for Practice ‘Cryptocurrencies, Blockchains and Smart Contracts’ selections, in which Luu et al. look at smart contracts in Ethereum. Smart contracts are a really intriguing idea and have generated a lot of interest/excitement, but they also have a number of properties which make them both likely targets for attackers and hard to get right. Regular readers of The Morning Paper will not be surprised to see our old friend error and exception handling popping up as one of the chief causes of problems again! After scanning 19,366 Ethereum contracts using the OYENTE tool described in the paper, the authors found vulnerabilities in 8,833 of them.

Here’s the plan: after a brief introduction to smart contracts, we’ll discuss what it is that makes them especially attractive targets, followed by a look at typical vulnerabilities. We’ll then finish up by seeing what we can do about the situation to make contracts more secure in the future.

What exactly is a smart contract?

A smart contract is identified by an address (a 160-bit identifier) and its code resides on the blockchain. Users invoke a smart contract in present cryptocurrencies by sending transactions to the contract address. Specifically, if a new transaction is accepted by the blockchain and has a contract address as the recipient, then all participants on the mining network execute the contract code with the current state of the blockchain and the transaction payloads as inputs. The network then agrees on the output and the next state of the contract by participating in a consensus protocol.

In Ethereum, contracts are introduced to the blockchain via special creation transactions. Contracts are essentially functions whose Ethereum Virtual Machine (EVM) bytecode is incorporated in the blockchain as part of the creation transaction. The contracts themselves can be written in higher-level languages and compiled to EVM bytecode. Contract functions are stateful: they have private storage on the blockchain, and can also hold some amount of virtual Ether coins. The private storage is allocated and initialized by running a constructor; subsequent transactions sent to the contract address invoke the anonymous function.

Here’s an example Puzzle contract:

Note the contract state declared on lines 2-6, constructor on lines 8-13, and anonymous transaction function on lines 15-29. A default input variable msg holds the sender, amount of Ether sent to the contract, and any included data as part of the invocation. In this particular contract, if the owner initiates the transaction (line 16) they can extract the current reward value and replace it with some other amount (lines 17-21). Anyone else invoking the transaction can submit a potential solution, and will receive the reward if the solution is accepted (lines 23-29).

All miners execute the transaction, which will incur some computation cost:

Ethereum pays miners some fees proportional to the required computation. Specifically, each instruction in the Ethereum bytecode has a pre-specified amount of gas. When a user sends a transaction to invoke a contract, she has to specify how much gas she is willing to provide for the execution (called gasLimit) as well as the price for each gas unit (called gasPrice). A miner who includes the transaction in his proposed block subsequently receives the transaction fee corresponding to the amount of gas the execution actually burns multiplied by gasPrice.

If the execution costs more than the gasLimit then execution is terminated and the state is restored to the initial state at the start of the function execution. The miner still receives gasLimit compensation though.
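The fee rule above can be sketched in a few lines of Python. This is a toy model, not the EVM: the function name `execute` and the `GAS_COST` table are hypothetical, and the per-instruction costs are illustrative numbers only. State handling is omitted; the sketch only tracks whether execution finished and what the miner is paid.

```python
# Illustrative per-instruction gas costs (assumed values, not a real fee schedule).
GAS_COST = {"ADD": 3, "MUL": 5, "SSTORE": 20000}


def execute(instructions, gas_limit, gas_price):
    """Return (succeeded, fee_paid_to_miner) for a list of opcode names."""
    gas_used = 0
    for op in instructions:
        cost = GAS_COST[op]
        if gas_used + cost > gas_limit:
            # Out of gas: execution stops, contract state would be rolled back,
            # and the sender is still charged for the full gas limit.
            return False, gas_limit * gas_price
        gas_used += cost
    # Normal completion: pay only for the gas actually burned.
    return True, gas_used * gas_price


print(execute(["ADD", "MUL"], gas_limit=100, gas_price=2))
print(execute(["ADD", "SSTORE"], gas_limit=100, gas_price=2))
```

The first call completes and is charged for 8 units of gas; the second runs out of gas on the storage write and is charged for the whole limit.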

Why are smart contracts attractive targets?

Smart contracts have associated value – potentially handling large numbers of coins worth hundreds of dollars apiece. The 19,366 contracts in the first 1,460,000 blocks in the Ethereum network had a total balance of over 3 million Ether (about $30M USD) at the time the paper was written. The infamous attack on ‘TheDAO’ caused a loss of about $60M to TheDAO’s investors.

Smart contract vulnerabilities

So we know that smart contracts have value as attack targets. They also have a combination of features that should make any experienced software developer raise an eyebrow:

  • They execute in permissionless networks which arbitrary participants can join (i.e., under byzantine conditions)
  • Miners and/or callers have meaningful control over the environment in which the transactions execute (which transactions to accept, transaction ordering, setting of block timestamp, manipulation of call stack)
  • All of the above must be reasoned about in an environment which punishes anyone who doesn’t get it right first time – there is no patching mechanism:

There is no way to patch a buggy smart contract, regardless of its popularity or how much money it has, without reversing the blockchain (a formidable task). Therefore, reasoning about the correctness of smart contracts before deployment is critical, as is designing a safe smart contract system.

Note: you can explicitly design versioning/upgrade capabilities into your smart contract code, since contracts can call each other. See e.g., http://ethereum.stackexchange.com/questions/2404/upgradeable-smart-contracts. But it remains the case that the original bytecode associated with the contract is immutable for all time.

The authors discuss four major categories of vulnerabilities in smart contracts: transaction-ordering dependence, timestamp dependence, mishandled exceptions, and reentrancy vulnerability.

Miners can control the order in which transactions are executed. In particular, this means that the state of a contract at the time a user submits a transaction may not match the state of the contract at the time the transaction executes, if another transaction updates it first (version numbering and optimistic concurrency control, anyone?). In the Puzzle example, someone may submit a solution hoping for the big reward, and the owner can nip in with another transaction that replaces it with little or no reward, thereby benefiting from the solution without paying out.
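To make the ordering problem concrete, here is a small Python model of a Puzzle-like contract. It is a sketch under assumptions, not the contract from the paper: the class, the `call` method, the `"correct"` stand-in for the solution check, and the account names are all hypothetical. The same two transactions are replayed in both orders.

```python
class Puzzle:
    """Toy model of a reward puzzle; not the paper's contract code."""

    def __init__(self, owner, reward):
        self.owner = owner
        self.reward = reward
        self.solved = False

    def call(self, sender, value=0, solution=None):
        """Process one transaction and return the amount paid to the sender."""
        if sender == self.owner:
            if self.solved:
                return 0
            # Owner takes back the current reward and installs a new one.
            old_reward, self.reward = self.reward, value
            return old_reward
        if solution == "correct" and not self.solved:  # stand-in for the real check
            self.solved = True
            return self.reward
        return 0


def run(order):
    """Replay transactions in the given order against a fresh puzzle."""
    puzzle = Puzzle(owner="owner", reward=100)
    return {tx["sender"]: puzzle.call(**tx) for tx in order}


solve = {"sender": "alice", "solution": "correct"}
update = {"sender": "owner", "value": 0}

print(run([solve, update]))  # solver's transaction ordered first
print(run([update, solve]))  # owner's update ordered first
```

When the solver's transaction runs first she collects the full reward; when the owner's update is ordered first, the solution is still accepted but pays nothing.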

Some contracts use the block timestamp as a triggering condition or source of randomness (don’t do this!).

Let us recall that when mining a block, a miner has to set the timestamp for the block (Figure 2). Normally, the timestamp is set as the current time of the miner’s local system. However, the miner can vary this value by roughly 900 seconds, while still having other miners accept the block… Thus, the adversary can choose different block timestamps to manipulate the outcome of timestamp-dependent contracts.
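A short Python sketch shows why this matters for a contract that derives "randomness" from the timestamp. Everything here is an assumption for illustration: the lottery rule, the function names, and the symmetric window around the miner's clock. The only figure taken from the text is the tolerance of roughly 900 seconds.

```python
import hashlib

TOLERANCE_SECONDS = 900  # approximate leeway quoted above


def lottery_winner(timestamp, players):
    """Hypothetical contract logic: pick a winner from a hash of the timestamp."""
    digest = hashlib.sha256(str(timestamp).encode()).digest()
    return players[int.from_bytes(digest, "big") % len(players)]


def favourable_timestamps(now, players, me):
    """Timestamps within the tolerance window for which `me` wins."""
    window = range(now - TOLERANCE_SECONDS, now + TOLERANCE_SECONDS + 1)
    return [t for t in window if lottery_winner(t, players) == me]


players = ["alice", "bob", "miner"]
choices = favourable_timestamps(1_462_406_400, players, "miner")
print(len(choices), "acceptable timestamps make the miner win")
```

With three players, about a third of the candidate timestamps favour the miner, so a miner who also plays can simply pick one of them when assembling the block.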

Ethereum contracts can call each other, but exceptions in the callee contract may not be propagated to the caller (depending on exact circumstances). “This inconsistent exception propagation policy leads to many cases where exceptions are not handled properly.” An adversary can load the dice by preparing a contract which calls itself 1023 times before calling the target contract. The Ethereum virtual machine has a 1024 call stack depth limit. Filling the call stack in this way means that the next call the target contract makes will fail with an exception. There is no atomicity here – any actions taken in the target contract directly will be preserved, but those taken by the contract it called will not. For example, ownership of a resource may be transferred without payment being made.
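The call-depth attack can be modelled in Python as follows. This is a simplified sketch: the `Sale` contract, the `send` helper, and the way depth is passed around explicitly are all hypothetical, chosen only to show a failed call whose result is ignored.

```python
MAX_CALL_DEPTH = 1024  # call stack depth limit mentioned above


def send(balances, recipient, amount, depth):
    """Model of a low-level send issued by code running at `depth`.

    Returns False instead of raising when the new frame would exceed the limit.
    """
    if depth + 1 > MAX_CALL_DEPTH:
        return False
    balances[recipient] = balances.get(recipient, 0) + amount
    return True


class Sale:
    """Hypothetical contract: pay the current owner, then hand over ownership."""

    def __init__(self, owner):
        self.owner = owner

    def buy(self, buyer, price, balances, depth):
        send(balances, self.owner, price, depth)  # BUG: result is never checked
        self.owner = buyer  # ownership moves even if the payment failed


def run_sale(depth):
    """Invoke the sale with the contract executing at the given call depth."""
    balances = {}
    sale = Sale(owner="seller")
    sale.buy("buyer", 50, balances, depth)
    return sale.owner, balances.get("seller", 0)


print(run_sale(depth=1))               # ordinary call: seller is paid
print(run_sale(depth=MAX_CALL_DEPTH))  # attacker pre-filled the stack
```

In the second run the attacker's 1023 self-calls leave the target executing at depth 1024, so its payment silently fails while the ownership change sticks.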

If you had to guess what else might cause problems beyond error and exception handling, concurrency-related bugs are always a good bet. And indeed that turns out to be the case, leading to reentrancy vulnerabilities:

In Ethereum, when a contract calls another, the current execution waits for the call to finish. This can lead to an issue when the recipient of the call makes use of the intermediate state the caller is in. This may not be immediately obvious when writing the contract if possible malicious behavior on the side of the callee is not considered.

This example contract exhibits the vulnerability:

Line 11 sends the current balance to the contract address wishing to withdraw its balance, but the balance is not zeroed until after the call (line 13). The callee contract can call back into withdrawBalance again and make multiple withdrawals in this manner.
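The same pattern can be reproduced in a small Python model. It is a sketch, not a translation of the contract above: the `Bank` and `Attacker` classes, the `vault` field, and the `zero_first` switch are hypothetical names introduced for illustration. With `zero_first=False` the balance is cleared only after the external call; with `zero_first=True` it is cleared before.

```python
class Bank:
    """Toy model of a contract holding per-address balances."""

    def __init__(self, zero_first):
        self.zero_first = zero_first
        self.balances = {}
        self.vault = 0  # total funds held by the contract

    def deposit(self, address, amount):
        self.balances[address] = self.balances.get(address, 0) + amount
        self.vault += amount

    def withdraw_balance(self, caller):
        amount = self.balances.get(caller.address, 0)
        if amount == 0 or amount > self.vault:
            return
        if self.zero_first:
            self.balances[caller.address] = 0  # safe: update state before the call
        self.vault -= amount
        caller.receive(self, amount)           # external call hands over control
        self.balances[caller.address] = 0      # unsafe ordering: zeroed too late


class Attacker:
    def __init__(self, reentries):
        self.address = "attacker"
        self.received = 0
        self.reentries = reentries

    def receive(self, bank, amount):
        self.received += amount
        if self.reentries > 0:
            self.reentries -= 1
            bank.withdraw_balance(self)  # call back in before the balance is zeroed


def simulate(zero_first):
    bank = Bank(zero_first)
    bank.deposit("victim", 90)
    bank.deposit("attacker", 10)
    attacker = Attacker(reentries=2)
    bank.withdraw_balance(attacker)
    return attacker.received, bank.vault


print(simulate(zero_first=False))  # attacker withdraws its 10 three times
print(simulate(zero_first=True))   # attacker gets only its own 10
```

Updating the balance before making the external call is the usual way to close this hole, since the re-entrant call then sees a zero balance.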

Protecting smart contracts

The proposal to defend against transaction ordering is to allow a guard clause (predicate) to be evaluated before transaction execution. If the guard clause evaluates to false the transaction will not execute. Using this, you can roll-your-own optimistic concurrency scheme.
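A guarded transaction might look like the following Python sketch. The function `submit_solution`, the dictionary-based contract state, and the lambda guard are assumptions made for illustration; the paper defines the mechanism at the level of Ethereum's semantics, not as this API.

```python
def submit_solution(state, solution_ok, guard):
    """Run the solver's transaction only if the guard holds on the current state.

    Returns the payout, or None if the transaction was not executed at all.
    """
    if not guard(state):
        return None  # guard failed: no state change, solution is not consumed
    if solution_ok and not state["solved"]:
        state["solved"] = True
        return state["reward"]
    return 0


# The solver saw a reward of 100 when preparing the transaction.
expects_100 = lambda s: s["reward"] == 100

unchanged = {"reward": 100, "solved": False}
print(submit_solution(unchanged, True, expects_100))

# The owner's update was ordered first and cut the reward to 0.
tampered = {"reward": 0, "solved": False}
print(submit_solution(tampered, True, expects_100), tampered["solved"])
```

The guard plays the role of the version check in optimistic concurrency control: the transaction commits only if the state it was prepared against is still the state it runs against.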

The solution to timestamp dependency is not to depend on timestamps – there are better sources of both randomness and timestamps available. “A practical fix (for the latter) is to translate existing notions of timestamp into block numbers.”

For exception handling, the straightforward solution is to check the return value whenever one contract calls another! If clients upgrade, an even better solution is to propagate exceptions at the level of the EVM from callee to caller and revert the state of the caller if they are not properly handled.
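The second option, propagating the failure and reverting the caller, can be pictured as snapshot and rollback. The Python sketch below is only an analogy for what an EVM-level change would do; `CallFailed`, `run_atomically`, and the example action are hypothetical names.

```python
import copy


class CallFailed(Exception):
    """Raised when a call to another contract fails."""


def run_atomically(state, action):
    """Apply `action` to `state`; undo every change if a nested call fails."""
    snapshot = copy.deepcopy(state)
    try:
        action(state)
        return True
    except CallFailed:
        state.clear()
        state.update(snapshot)  # revert the caller's own changes too
        return False


def transfer_then_pay(state):
    state["owner"] = "buyer"  # caller-side change happens first
    raise CallFailed("payment to previous owner failed")  # callee failure propagates


state = {"owner": "seller"}
print(run_atomically(state, transfer_then_pay), state)
```

Because the failure propagates and the snapshot is restored, the ownership change does not survive a failed payment.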

Based on a model of the operational semantics of the Ethereum bytecode (worth the price of admission all by itself, and we don’t even have the space to cover it here at all!), the authors build a verification tool called OYENTE which can symbolically execute contracts and look for vulnerabilities. OYENTE is about 4,000 lines of Python, and uses Z3 as the solver to decide satisfiability.

What OYENTE discovered

We collected 19,366 smart contracts from the blockchain as of May 5, 2016. These contracts currently hold a total balance of 3,068,654 Ether, or 30 Million US dollars at the time of writing… On an average, a contract has 318.5 Ether, or equivalently 4523 US dollars.

8,833 of the contracts have at least one security issue: 5,411 contracts (27.9%) have mishandled exceptions; 3,056 contracts (15.7%) have transaction-ordering dependencies; 83 contracts have timestamp dependencies, and 340 contracts have reentrancy handling problems – one of which is the infamous TheDAO contract. You can see several examples of found vulnerabilities in section 6 of the paper.