Skip to content

The ring of Gyges: investigating the future of criminal smart contracts

September 8, 2017

The Ring of Gyges: investigating the future of criminal smart contracts Juels et al, CCS’16

The authors of this paper wrote it out of a concern for the potential abuse of smart contracts for criminal activities. And it does indeed demonstrate a number of ways smart contracts could facilitate crime. It’s also though, another good case study in considerations when designing smart contracts. After all, the stakes are rather high – it’s not just financial considerations but also personal liberty at risk if caught – and the participants certainly may not be trustworthy.

In this paper, we explore the risk of smart contracts fueling new criminal ecosystems. Specifically, we show how what we call criminal smart contracts (CSCs) can facilitate leakage of confidential information, theft of cryptographic keys, and various real-world crimes (murder, arson, terrorism).

Core to the design of the smart contracts (CSCs) is a property the authors call commission-fairness. Commission fairness says that we must guarantee both the commission of a crime and commensurate payment for the perpetrator, or neither. Of course, the fairness issue also applies to many smart contracts in which we want to guarantee either the promised outcome plus payment, or neither. What looks like a fair exchange on the surface can often hide mechanisms that allow a party to a CSC to cheat.

Correct construction of CSCs can thus be quite delicate… As our constructions for commission-fair CSCs show, constructing CSCs is not as straightforward as it might seem, but new cryptographic techniques and new approaches to smart contract design can render them feasible and even practical. … While our focus is on preventing evil, happily the techniques we propose can also be used to create beneficial contracts.

What makes smart contracts interesting for criminal activity?

Decentralized smart contracts have a number of advantages that can benefit criminals:

  • They enable fair exchange between mutually distrustful parties based on contract rules, removing the need for physical rendezvous, third-party intermediaries, and reputation systems.
  • The minimal interaction makes it harder for law enforcement – a criminal can set up a contract and walk away allowing it to execute autonomously with no further interaction.
  • The ability to introduce external state into transactions through an oracle service (e.g., Oraclize) broadens the scope of possible CSCs to include calling-card based physical crimes.

Even before smart contracts, just the pseudonymity offered by cryptocurrencies has been happily embraced by criminals, so there’s every reason to think the same path might be taken with smart contracts.

A major cryptovirological “improvement” to ransomware has been use of Bitcoin, thanks to which CryptoLocker ransomware has purportedly netted hundreds of millions of dollars in ransom.

There are of course questions concerning just how good the anonymity provided by Bitcoin really is, but there are also schemes being developed that improve on this situation. And as the authors point out, “Ron and Shamir provide evidence that the FBI failed to locate most of the Bitcoin holdings of Dread Pirate Roberts (Ross Ulbricht), the operator of the Silk Road, even after seizing his laptop.

Leaking secrets

The first example scenario is payment-incentivised leakage of information. For example, I have access to pre-release Hollywood movies, trade secrets, government secrets, proprietary source code, industrial designs and so on. If you pay me, I’ll reveal the information (in the basic design, to the world, and then in a revision just to the payer).

Darkleaks is such a system built on top of Bitcoin, but it has some limitations and can only serve to leak secrets publicly, it has no facility for fair exchange in a private leakage scenario. With smart contracts, we can do much better…

Some challenges we need to overcome in the design of the contract include:

  • How the seller (contractor, C) proves they have the information
  • How to prevent the seller from delaying release of the information (and collecting their payment at the same time) until after the value of the information has reduced or expired (e.g., a film is publicly released).
  • How to prevent the seller from selectively withholding parts of the information

Take a manuscript M and partition it into n segments, \{m_i\}^{n}_{i = 1}. Consider first the scenario where C wants to sell it for public release. The seller randomly generates a master secret key, msk, and provides a cryptographic commitment c_0 := Enc(pk, msk, r_0) of the master secret key to the contract. (pk is the public key of a Bitcoin account for receiving payment).

  1. The seller now generates unique encryption keys for each segment \mathcal{K}_i := PRF(msk, i) where PRF is a Pseudo-Random Function, and encrypts each segment using the appropriate key. The encrypted segments can be placed in some public location (to avoid storing them in the contract), and their hashes sent to the contract. Interested purchasers / donors can thus download the segments, but can’t yet decrypt them. The contract is initialised with the initial commitment c_0, the segment hashes, and an end time T_end. The contract is now in the CREATED state.
  2. To ensure potential purchasers can trust the leaker, the contract returns a challenge set \Omega, a random subset of the segments. In a response to the challenge the leaker invokes the confirm function providing the set of keys \{\mathcal{K}_i\}_{i \in \Omega} to the contract as well as zero-knowledge proofs that the keys were correctly generated from the master secret key encrypted as c_0. Once the contract has verified these proofs, the contract moves to the CONFIRMED state.
  3. Potential purchasers / donors can now use the revealed secret keys to decrypt the corresponding segments. If they like what they see, they can pledge money to be paid to the leaker on release of the material.
  4. At any time, if the set of pledged donations is satisfactory for the leaker, the leaker can call the accept function decommiting the master secret key. The contract verifies the mks, and if this is successful all donated money is paid to the leaker.
  5. If the end time has elapsed and donations have not been accepted by the leaker, donors can claim their funds back.

To support private leakage (see appendix F.3!) the leaker does not reveal the master secret key directly, but instead provides a ciphertext ct = enc_{pk_p}[msk] to the contract for a purchaser p, along with a proof that ct is correctly formed.

Buying stolen private keys

The second scenario is a contract offering a reward for provision of a private key (e.g., of a victim certificate authority). I’ll go over this one more briefly. An initial attempt at the contract might look something like this:

There are two weaknesses in the above protocol though: it is vulnerable to revoke-and-claim attacks and rushing attacks.

In a revoke-and-claim attack the target CA can revoke its own key and then submit the (now useless) key for payment. So the target of the attack actually ends up profiting from it! The contract is not commission fair in the sense that it doesn’t currently promise to deliver a usable private key.

In a rushing attack when a perpetrator submits a claim, the contractor can learn the private key and then submit their own claim using it. If the contractor can influence ordering (e.g., by attacking the networking layer) they can make their own claim arrive before the valid one.

To defend against revoke and claim attacks, a reward truncation feature is added to the contract. This enables the contract to accept evidence of a key revocation at any point in time. This could be submitted by the contractor, or the contractor could even provide a small reward to a third-party submitter. The payout in the contract for provision of a private key then becomes a function of how long the key remains valid for. It still seems to me that a revoke-and-claim attacker that revokes a key and immediately submits a claim still has a good chance of avoiding the detection window before their claim is processed.

To defend against rushing attacks the claim is separated into two phases. In the first phase a perpetrator expresses an intent to claim by submitting a commitment of the real claim message. Not explicitly stated (that I could see), but I presume this state in the contract ‘locks’ the reward for submitter. In the next round (once the submitter can see that their commitment is accepted) they can open the commitment revealing the claim message, and receive payment.

Here’s the extended key theft contract spec:

The paper also discusses variants where the contract targets multiple victims at once, and where false cover claims can be used to leave victims unsure whether or not a successful claim has been committed.

Calling card crimes

A calling card is an unpredictable feature of a to-be-committed crime – for example, a day, time, and place. Given a contract for an assassination for example, an assassin can specify a day, time, and place. If the victim is indeed killed at that time and location, then the contractor can be reasonably sure the assassin is responsible and release payment.

Calling cards, alongside authenticated data feeds, can support a general framework for a wide variety of CSCs.

As a example, consider a contract for web site defacement. The contractor specifies a website url to be hacked and a statement to be displayed.

We assume a data feed that authenticates website content, i.e., s(t) = (w,url,t), where w is a representation of the webpage content and t is a timestamp, denoted for simplicity in contract time. (For efficiency, w might be a hash of and pointer to the page content.) Such a feed might take the form of, e.g., a digitally signed version of an archive of hacked websites (e.g.,

We can then proceed as follows:


As a proposed counter-measure, the authors suggest making smart contracts trustee neutralizable: an authority, quorum of authorities, or some set of system participants would have the power to remove a contract from the blockchain. This would still preserve anonymity for participants. Whether this would be accepted, I’m really not sure.

We emphasize that smart contracts in distributed cryptocurrencies have numerous promising, legitimate applications and that banning smart contracts would be neither sensible nor, in all likelihood, possible. The urgent open question raised by our work is thus how to create safeguards against the most dangerous abuses of such smart contracts while supporting their many powerful, beneficial applications.

SmartPool: Practical decentralized mining

September 7, 2017

SmartPool: Practical decentralized pooled mining Luu et al., USENIX Security 2017

Say you wanted to implement a mining pool that didn’t place power in the hands of centralized pool operators. If only there was some fully decentralised way of establishing trust and coordinating activities according to a policy, in which anyone could participate… Oh, wait, that sounds just like a smart contract!

In this work, we propose SmartPool, a novel protocol design for a decentralized mining pool. Our protocol shows how one can leverage smart contracts, autonomous blockchain programs, to decentralize cryptocurrency mining.

SmartPool is both a neat idea in its own right, and also a great case study in some of the considerations when designing smart contracts that we looked at last week. If you want to follow along at home, you can find the source code for the SmartPool contract on GitHub.

As part of their evaluation, the authors ran a SmartPool mining pool for a week with a peak hashrate of 30 GH/s, and mined 105 blocks costing miners just 0.6% of block rewards in transaction fees. Compare this to the 3% typically taken by centralized pools. (At the exchange rate on the day I’m writing this, 105 blocks comes out at just shy of $200K in miner rewards.)

Why do we care about decentralized mining pools?

The whole point was to be decentralized, but over time mining pools have come to hold increasing power in the Bitcoin and Ethereum networks: Bitcoin derives 95% of its mining power from just 10 pools, and Ethereum gets 80% of its mining power from only 6 pools. The consolidation of power leads to potential hijacking attacks, and threats to the core security of the Bitcoin consensus protocol if a single pool operator controls more than 51% of the mining power (as has happened in both Ethereum and Bitcoin on several occasions – e.g. DwarfPool and respectively).

Pools can also dictate which transactions get included in the blockchain, increasing the threat of transaction censorship or favouring their own blocks.

For example, users recently publicly speculated that a large Ethereum pool favored its own transactions in its blocks to gain an advantage in a public crowdsale.

And then of course there are the comparatively hefty fees (~3%) to be paid to pool operators.

In Bitcoin, one decentralized mining pool solution does exist, called P2Pool, but as the authors demonstrate it suffers from a lack of scalability and has not been widely adopted.

Challenges for a decentralized mining pool

We consider the problem of building a decentralized protocol which allows a large open network to collectively solve a computational PoW puzzle, and distribute the earned reward between the participants proportional to their computational contributions.

Such a protocol should be decentralised, efficient, secure, and fair.

  • Decentralised – there should be no central operator managing the pool and its participants.
  • Efficient – running costs should be low and the variance and rewards for miners should be comparable to centralized solutions. Overheads and other costs (e.g., gas costs) incurred by participants must be reasonably small
  • Secure – protection from attackers wanting to steal rewards, game the system, or prevent others from joining
  • Fair – participants should receive rewards in proportion to their share of the contributions.

Assume a threat model in which miners are rational and can deviate arbitrarily from the honest protocol to gain more reward.

SmartPool high-level overview

At a high level, SmartPool replaces the mining pool operator with a smart contract. The smart contract acts as a trustless bookkeeper for the pool by storing all shares submitted by miners. When a new share is submitted, the contract verifies the validity of the share, checks that no previous record of the share exists, and then updates the corresponding miner’s record.

Here’s a starter for 10 which shows the basics of the idea:

Unfortunately this version has a number of shortcomings which we’ll need to fix (compare to e.g., the final contract source).

  1. If there are a large number of shares in the pool, then an unwieldy number of messages will be sent to the contract. For example, if it takes 1,00,000 shares to mine a block, that’s 1,00,000 transactions. But reducing the number of shares per block increases the variance in reward for miners, defeating the point of pooling.
  2. The gas fees incurred for submitting shares to the pool might cause miners to end up with negative rewards overall
  3. Transactions are in plaintext, so network adversaries can see the transactions of other miners and either steal or resubmit the shares. (In centralized pools, a secure and private connection to the pool can be assumed, so this issue does not exist).
  4. It’s difficult to use the SmartPool for e.g. Bitcoin mining when the contract is deployed on Ethereum. The contract must guarantee correct payments in Bitcoin, but Ethereum contracts operate only in Ether.

SmartPool is decentralised, efficient, secure, and fair:

  • Decentralised – the pool is implemented as a smart contract
  • Efficient – miners submit shares in batches (e.g., 1 transaction can claim 1 million shares). Miners do not need to submit data for all of the shares, but only a few for verification purposes.
  • Secure – miners must commit the right set of beneficiary addresses in a share before mining, so that it can’t be change after a solution is found.
  • Fair – a probabilistic verification of miner submissions aided by a Merkle-tree based commitment scheme guarantees the same average outcome as running a full verification.

To handle the case of Bitcoin mining on top of Ethereum, SmartPool leverages the Bitcoin coinbase transaction so that miners don’t need to trust a third party to proxy the payment. Miners do still need to pay for gas in Ether. Gas payments should be less than 1% of the total miners’ reward.

SmartPool maintains two main lists in its contract state – claim list claimList and a verified claim list verClaimList.

  • When a miner submits a set of shares as a claim for the current block, it is added to the claim list. This acts as a cryptographic commitment to the shares claimed to be found by the miner.
  • SmartPool verifies the validity of the claim, and once verified, moves it to the verified claim list. Claim verification and payments for verified claims happen in a single atomic transaction.

Submitted claims therefore need to include sufficient metadata for verification purposes. Enabling efficient verification is one of the key contributions of the paper.

The goal of the verification process is to prevent miners from both submitting invalid shares and over-claiming the number of shares they have found.

The claim data structure is a cryptographic commitment to the shares being claimed, based on an augmented Merkle tree structure we’ll look at next. We want to verify that (i) all submitted shares are valid, (ii) no share is repeated twice in a claim, and (iii) each share appears in at most one claim.

For efficiency, SmartPool uses a simple but powerful observation: if we probabilistically verify the claims of a miner, and pay only if no cheating is detected, then expected payoffs of cheating miners are the same or less than those of honest miners. In effect, this observation reduces the effort of verifying millions of shares down to verifying one or two!

Here’s the high level intuition. Consider a cheating miner submitting a claim to 1000 shares, but which contains only 500 valid shares. If SmartPool randomly samples only one share, then the odds of the cheating being detected are 1/2. If the miner is caught cheating, they get paid nothing. If he gets lucky and is not detected, he gets rewarded for 1000 shares. Hence the expected payoff is still 500 – exactly the same as if an honest miner claimed the right number of shares.

By sampling k \geq 1 times, SmartPool can reduce the probability of a cheater remaining undetected exponentially small in k.

The Augmented Merkle tree

The data structure containing the submitted claims that makes all this possible is an augmented Merkle tree. The Merkle tree part of this enables us to commit to a batch of shares in a single claim, the augmented part verifies that the committed set of claims have increasing counters. All shares must include a counter, which must be strictly increasing with each share, and the shares must be included in the tree in increasing counter order, left-to-right.

Here are the rules:

Let ctr be a one-to-one function mapping shares to integers. Then an augmented Merkle tree for a set of objects S = \{s_1, s_2, ..., s_n\} is a tree whose nodes x have the form (min(x), hash(x), max(x)) where:

  • min(x) is the minimum of the children’s min (or ctr(s_i) if x is a leaf corresponding to the object s_i).
  • hash(x) is the cryptographic hash of the concatenation of the children nodes (or hash(s_i) if x is a leaf corresponding to the object s_i)
  • max(x) is the maximum of the children’s min (or ctr(s_i) if x is a leaf corresponding to the object s_i)

A picture is in order! Here I’m using 16-bit hashes for convenience of illustration.

And then after I’d drawn that, I noticed the paper also includes the following figure – but I think mine is prettier!

In the figure above, to prove that share c has been committed, a miner has to submit two nodes d and e to SmartPool. SmartPool can reconstruct the other nodes on the path from c to the root. The proof for one share in a Merkle tree of height h contains h hashes.

Any duplicated shares in a claim will yield a sorting error in at least one path of the augmented Merkle tree.

Thus, by sampling the tree in a constant number of leaves and checking their corresponding paths, with high probability we will detect a sorting error in the augmented Merkle tree if there is one.

To prevent miners from submitting the same shares in two different claims, things are set up so that the maximum counter value in an earlier submission is always smaller than the minimum counter of shares in a later one. For example, the pair (block-timestamp, nonce) can be used as counter in an Ethereum block.

To get a random seed for sampling, the hashes of future blocks can be used.

Here’s the full protocol so far:


Is it fair?

Section 4 contains an analysis proving the following theorem:

Let A be a Merkle tree. If A is sorted, then all paths in A are valid. If A is not sorted, then every leaf containing a sorting error lies on an invalid path.

The corollaries are:

  • Every Merkle tree has at least as many invalid paths as sorting errors among the leaves. In particular, there are at least as many invalid paths as there are duplicate values among the leaves.
  • Under the payment scheme (no payout if caught cheating), if SmartPool checks one random path in the augmented Merkle tree of a claim, the expected reward when submitting invalid or duplicate shares is the same as the expected reward when submitting only valid shares.

An against an adversary;

If an adversary controls less than 50% of Ethereum hash power, then it suffices to sample only two paths of the augmented Merkle tree based on two consecutive blocks to pay miners fairly, on average.

Implementation details

Section 6 details the implementation and the results from a week of mining. There are a few extra important details here not included in the main body of the text:

  • The need to implement the Ethash function in the contract in order to verify that a block header satisfies the required difficulty (this accounts for a lot of the code you see in the source).
  • How to avoid paying excessive costs to store the Ethereum dataset (needed for Ethash) in the contract – storing the full dataset, which changes every four days, would cost about 3M USD/day as of June 2017!
  • How to handle coinbase transactions in Bitcoin.

The appendix discusses the intriguing possibility of a cryptocurrency in which SmartPool is deployed natively as the only mining pool.

REM: Resource-efficient mining for blockchains

September 6, 2017

REM: Resource-efficient mining for blockchains Zhang et al., USENIX Security 2017

The proof-of-work (PoW) used in most blockchains could just as easily be called proof-of-wasted-energy. All that hashing serves no useful end beyond electing the next block in the chain. The combined energy waste is actually pretty staggering:

PoWs serve no useful purpose beyond consensus and incur huge monetary and environmental costs. Today the Bitcoin network uses more electricity than produced by a nuclear reactor, and is projected to consume as much as Denmark by 2020.

As a result other consensus schemes have been explored, including BFT-based and Proof of Stake, but these either “restrict consensus participation or have notable security limitations.” Instead of trying to avoid doing work, in today’s paper the authors suggest that we instead find a way to make the work that is done useful beyond the blockchain. Instead of a Proof of Work, we would therefore have a Proof of Useful Work (PoUW).

In a PoUW system, users can utilize their CPUs for any desired workload, and can simultaneously contribute their work towards securing a blockchain.

The overhead for doing work under the PoUW scheme, vs just running it natively, turns out to be around 5-15%.

At the core of REM (Resource-Efficient Mining) is a reliance on Intel’s SGX, but as we’ll see, there any many parts that have to come together to make the overall scheme work. One key part of the scheme addresses the question of resilience in the face of compromised SGX devices. Let’s look first at the basic execution model, and then we can dive into the security model.

Proof of useful work

Here’s the big picture:

There are blockchain agents, REM miners, and one or more useful work clients. Agents collect transactions and generate a block template, which is a candidate block lacking a PoUW. Miners take a block template and obtain a useful workload from a useful work client, in the form of a PoUW task.

A PoUW task is based on a PoUW SGX enclave together with some input. Starting with any SGX-compliant program (something like SCONE would be possible to run containers inside of SGX), it is easy to turn it into a PoUW enclave by running it through the REM toolchain. A PoUW enclave wraps the provided program to ensure single-threaded execution (needed for security), and to add code to detect whether the work has resulted in a valid block. To help with this, the toolchain also instruments the program so that it accurately tracks the number of executed instructions.

When the PoUW task completes, the enclave has to decide whether it resulted in a valid block.

The PoUW enclave randomly determines whether the work results in a block by treating each instruction as a Bernoulli trial. Thus mining times are distributed in much the same manner as in proof-of-work systems. While in, e.g., Bitcoin, effort is measured in terms of executed hashes, in REM, it is the number of executed useful-work instructions.

To decide whether an instruction is a ‘winner’ the PoUW enclave generates a random number using the SGX random number generator (SRNG) and looks for a value smaller than the target prescribed by the current difficulty level. Doing this process after every instruction would be prohibitively expensive, so instructions are batched into sub-tasks (of say 10 seconds duration) and run the random number generator once, looking for a result less than the target weighted by the number of counted instructions in the batch. Instruction counts are used because unlike CPU cycles, they are not vulnerable to manipulation outside of the enclave. This sub-task batching process is described in §5.1. However, in §5.2 on secure instruction counting the suggestion is that the random draw happens only once at the end of the whole task based on the total instruction count. Either way, the principle is the same.

The security of instruction counting relies on the assumption that once instrumented, the code cannot alter its behavior. To realize this assumption in SGX, we need to require two invariants. First, code pages must be non-writeable; second, the useful work program must be single threaded.

REM checks that code pages in the enclave code ( have W\oplus X permission – this can be verified directly from the ELF program headers. Ensuring single thread access is necessary to prevent multiple threads tampering with the instruction count register value through the shared State Save Area (SSA). To achieve this end, the PoUW enclave instruments the enclave entry point with a spinlock that only allows the first thread to pass. REM also requires a subset of Software Fault Isolation (SFI): indirect control transfers alignment. This prevents jumping to the middle of an instruction to create an alternate execution (as in ROP) to falsify the instruction count.

Our implementation does not include SFI, as off the shelf solutions such as Google’s Native Client could be integrated with the PoUW toolchain and runtime with well quantified overheads.

If the enclave determines that a block has been successfully mined, it produces a PoUW based on SGX-generated attestations: one confirming the enclave’s compliance with REM (the miner ran a valid program), and one confirming the difficulty parameter that the block was mined with.

The blockchain agent concatenates the PoUW to the block template and publishes it to the network. I’ll describe the process by which other agents can verify the block in the next section.

Security model

The attestation model is anchored in Intel’s Attestation Service (IAS) and relies on Intel to behave honestly: signing keys must only be allocated to valid CPUs, and valid nodes should not have their attestations rendered invalid when IAS is queried. The reputational damage to Intel of either of these things not being true is considered sufficient deterrent.

REM uses attestations as proofs for new blocks, so miners need to access IAS to verify blocks. The current way in which IAS works forces miners to access IAS on every single verification, adding an undesirable round-trip time to and from Intel’s server to the block verification time. This overhead, however, is not inherent, and is due only to a particular design choice by Intel.

The simple change is for Intel to include the request it is verifying as part of the report generated in response to an IAS request (currently this is not present). With this change, only one access to IAS would be needed globally for every new block, and other validators could simply check the resulting signed report.

We still have the problem of how validators know that the program run in the enclave was correctly instrumented (wrapped in the PoUW enclave). Putting the programs in the blockchain (a bit like smart contract code) would incur prohibitive overhead (and preclude private code). Alternatively, having a single authority that verifies program compliance would also be an unacceptable centralisation of authority.

To resolve this predicament, we form PoUW attestations with what we call two-layer hierarchical attestations. We hard-code only a single program’s fingerprint into the blockchain, a static-analysis tool called compliance checker. The compliance checker runs in a trusted environment and takes a user-supplied program as input… it calculates the program’s fingerprint and outputs an attestation including this fingerprint.

(The compliance checker verifies the instruction counting instrumentation, the entry point spinlock protection, and that the PoUW runtime is correctly linked and identical to the expected runtime).

A PoUW includes two parts: the useful work program attestation on the mining success, and an attestation from the compliance checker.

Both must be signed by the same CPU, otherwise an attacker that compromises a single CPU could create fake compliance attestations for invalid tasks.

Compromised SGX nodes

This leads on nicely to a major part of the overall REM design, which is protection against compromised SGX chips (it could happen!).

… even if we assume SGX chips are manufactured in a secure fashion, some number of individual instances could be broken by well-resourced adversaries. A single compromised node could be catastrophic to an SGX-based cryptocurrency, allowing an adversary to create blocks at will and perform majority attacks on the blockchain.

Well, we don’t want that do we!

To protect against compromised chips, REM includes a block acceptance policy P_{stat} which adds an extra test on top of the instruction count lottery we’ve seen so far. P_{stat} is based on statistical testing. Starting with an estimate of the rate at which the fastest honest miners (fastest CPU type) can find valid PoUWs, P_{stat} test blocks statistically to determine whether a miner is mining blocks too quickly and thus may be compromised. Of course, an adversary with knowledge of P_{stat} can publish blocks at the fastest rate that it will accept. The algorithm can be tuned ($\alpha$) to trade-off between the amount of waste (falsely rejected blocks) and the advantage that such an adversary can gain.

With \alpha = 0.4 the waste level is 0.6%, and the adversaries advantage is capped at 100.6%.

Integrating REM into the Bitcoin consensus layer

It is important to note that REM is a consensus framework, i.e., a means to generate blocks, not a cryptocurrency. REM can be integrated into a cryptocurrency, as we show by swapping it into the Bitcoin consensus layer.

REM was evaluated with four sample useful work workloads: protein folding, an SVM classifier, zlib compression (iterated), and SHA3-256 hashing (iterated). The running times were compared for native code execution, execution under SGX, and execution under SGX inside a PoUW enclave.

The results show an overhead of 5-15% for converting useful work benchmarks into REM PoUW enclaves.

In summary, our results show that REM is a practically deployable and promising path to fair and environmentally friendly blockchains in partially-decentralized blockchains.

Luck is hard to beat: the difficulty of sports prediction

September 5, 2017

Luck is hard to beat: the difficulty of sports prediction Aoki et al., KDD’17

You can build all the outcome prediction models you like, such as the strategic play model we looked at yesterday, but some events have a certain amount of inherent unpredictability.

There have been empirical and theoretical studies showing that unpredictability cannot be avoided. Indeed, Martin et al., showed that realistic bounds on predicting outcomes in social systems imposes drastic limits on what the best performing models can deliver. And yet, accurate prediction is a holy grail avidly sought in financial markets, sports, arts and entertainment award events, and politics.

In ‘Luck is hard to beat’ the authors try to quantify the amount of inherent unpredictability in sports: i.e., how much of the outcome is due to luck, and how much due to skill? They look at data from 1,503 seasons across a variety of sports from 2007 to 2016. All the sports selected use a round-robin system where every team plays every other team twice: once at home and once away. In total we’re looking at: 42 leagues, and 310 seasons of basketball; 51 leagues and 328 seasons of volleyball; 25 leagues and 234 seasons of handball; and 80 leagues and 631 seasons of soccer.

The first step is the development of a skill coefficient, φ that indicates how far the results of a season (tournament) are from what would be expected if the competition was based on pure luck. Then for competitions that do seem to have a significant skill component, the paper examines how many teams need to be removed from the competition for it to return to an essentially random situation. It took me until quite deep into the paper to get an intuition for what this was really saying. Take an example from the English Premier League: for the seasons between 2007 to 2016 the average number of teams you need to remove is 4.9. Manchester United appeared in that list 9 out of the 10 seasons, and other teams appeared as often as 7 times in that list. So what this is saying is that if you take out a small number of the top clubs (relegation deals with the bottom clubs), then the remaining places are pretty much a random outcome. The final part of the paper looks at a Bayesian model of skills for NBA, one of the sports in which skill is shown to have a larger degree of influence on the outcome. Using the model, we can estimate the likelihood that the underdog (lesser skilled team) wins in a given game: 45% chance when playing at home, and 27% chance when playing away.

Modelling a skill coefficient

… our aim is to define a metric by which performance can be assessed with respect to a baseline where the teams have the same ability or skill and the tournament is determined by pure chance.

The authors build their model up step by step in a nice intuitive way. Let’s begin:

  • Let X_h be a random variable representing the points earned by a team when it plays in a match at home, and X_a be the points earned when it plays away. The distributions will be different for different sports. In football (soccer) for example it’s 3 points for a win, 1 for a draw, and 0 for a loss, regardless of whether you’re playing home or away.
  • Let P_h, P_t, and P_a be the probabilities that a home team wins, draws (ties) or loses respectively.
  • The expected value of X_h is therefore \mu_{X_h} = 3P_h + P_t, and variance is \sigma^{2}_{X_h} = 9P_h + P_t - \mu^{2}_{X_h}. (And the similarly for the expected value of X_a).

Now let’s think about a tournament setting with k+1 teams, where each team will therefore play 2k times. We’ll denote the score earned by a team playing their i-th game at home (away) as X_{hi} (X_{ai}).

  • Y_{2k} = \sum_{i}(X_{hi} + X_{ai}) is therefore the total random score a given team obtains at the end of a tournament.

If all teams had equal skill, after k matches at home and k away matches, the distribution of final scores Y_{2k} would follow approximately a normal distribution with parameters \mu_{2k} = k(\mu_h + \mu_a) and \sigma^{2}_{2k} = k(\sigma^{2}_{h} + \sigma^{2}_{a}). This represents our baseline model, against which the observed empirical results are to be contrasted.

  • For each season, we have the observed value of Y_{2k} for each team (i.e., we know how many points they scored!), so we can calculate the empirical variance s^2.
  • And finally, we can compare the observed variability to the expected value under the equal skills assumption:
    \phi = \dfrac{s^2 - \sigma^{2}_{2k}}{s^2}

Which gives us a skills coefficient \phi in the interval [-\inf, 1]. Values around zero are compatible with seasons following the random model. The closer the value to 1, the more dominant the skill factor. Negative values? Well, they indicate outcomes with a variability much smaller than that expected due to the luck factor. I.e., something peculiar is going on.

The balance between luck and skills in different sports

So after all this, what does the data say? For the sports investigated, basketball is the sport where skill plays the larges part, followed by volleyball, soccer, and then handball.

A possible explanation:

… players usually score many points in basketball and volleyball games. Due to the long sequence of relevant events, it is more difficult for a less skilled team to win a game out of pure luck. However, in soccer and handball there are much less relevant events leading to points… This makes it easier for a less skilled team to win a match due to a single lucky event.

All basketball leagues had a strong skill component in their final results, whereas in soccer luck can be a dominant factor in as many as 7.13% of seasons.

Even in basketball though, skill doesn’t completely determine the final ranking.

In each season, some teams have more extreme skills while the others have approximately the same skill level. It is that first group that drives the \phi coefficient words and close to its maximum value. Where they absent, the season would resemble a lottery.

So how many teams need to be removed from various leagues/seasons in order to make it appear random? I.e., how many sufficiently skills-differentiated teams are there in such leagues?

We saw the EPL example earlier. For the Spanish Primera Division (soccer) the average number of teams you need to remove is 3.2, which included Real Madrid and Barcelona for 9 out of the 10 seasons studied.

Looking at \phi over time we can see that it stays relatively constant by sport:

If we accumulate matches over all seasons in the dataset as if it where one long tournament we can see that for NBA, Serie A, and Primera Division the relative skills of teams are stable over the years, whereas in the Brazilian soccer league teams are more similar to each other in the long term.

A model of skills for NBA

What are the most important factors to explain the difference skills the teams show in a tournament?

The authors create a general Bayesian model based on Bradley-Terry that ends up looking like this:

The skills coefficient for team i of n is given by \mathbf{\alpha_i} , and \log(a_i) = \mathbf{w}^T \mathbf{x}_i . \varepsilon_k is a random effect factor for each match.

For the NBA, considering matches during the regular season (and not the playoffs), the authors fit a Bayesian model with features in \mathbf{x} as follows:

  • A binary indicator indicating East conference or West conferences (the 30 teams are divided into two conferences)
  • The average salary of the top five players
  • The average Player Efficiency Rating (PER) in the last year
  • Team volatility – how much a team changes its players between seasons
  • Roster aggregate coherence – measuring the strength of the relationship between players (how long they have played together)
  • Roster size – the number of players

(They tried several variations of features, these are the ones found to work the best). The following chart shows the correlation between team skills estimated by the model, and the number of games won during the regular season.

Table 2 (below) shows an important result. Based on the estimated skills \alpha_i for each match, we are able to identify which team is the more skilled. Hence… we can estimate the probability that the underdog wins. It is stable over the seasons and approximately equal to 0.36. This is the component of luck in the NBA outcomes.

This turns out to be 0.45 when playing at home, and 0.27 when playing away. (The right-hand columns in the following table show what happens when you remove the top few most skilled teams from the equation).

In conclusion

After all this analysis, there’s a line in the conclusions that comes straight from the Ministry of the Bleeding Obvious: “Looking at the most relevant features, our results suggest a management strategy to maximize the formation of teams with high skills, the controllable aspect of team performance.” Well, duh!

Sports outcomes are a mix of skills and good and bad luck. This mix is responsible for a large part of the sports appeal…. The final message of our paper is that sometimes the culprit is luck, about 35% of the time in NBA. It is hard to beat luck in sports. This makes the joy and cheer of crowds [sic].

“The Leicester City fairytale?”: Utilizing new soccer analytics tools to compare performance in the 15/16 and 16/17 EPL seasons

September 4, 2017

“The Leicester City Fairytale?” : Utilizing new soccer analytics tools to compare performance in the 15/16 and 16/17 EPL seasons Ruiz et al., KDD’17

In England the cricket season is coming to a close and a new football (soccer) season is getting underway. Today’s paper choice is a bit of fun from the recent KDD’17 conference, where the data scientists from look into what can be learned from Leicester City’s winning 2015/16 season. Along the way, they develop a strategic gameplay features model that goes behind the usual shots on target etc., and proves useful in predicting the outcomes of games.

The fairytale of Leicester City winning the English Premier League (EPL) in 2015/16 has been well documented… there were many storylines and explanations that resonated with the footballing community.

Much was made of Leicester City’s counter-attacking play: they tended to cede possession more often (43% possession vs a 55% average), and hit teams on the counterattack. If you look at their expected goal value though (the likelihood of an average player scoring from a given situation) they actually weren’t superior to the other top teams.

So if Leicester City were not more effective than other teams offensively due to their counter-attacking/direct style of play, how did they win the 2015/15 title? Well, there are two reasons…

  1. Absence of normal contenders – in the previous 5 years, each year a champion emerged with significantly better goal scoring effectiveness than the average. In the 2015/16 season, that did not happen.
  2. Leicester City had the most effective defence.

Three factors contributed to their defensive effectiveness: goal-keeping, their defensive strategy, and how effective they were at disrupting passes. Let’s quickly cover the goal-keeping and pass interception stats so that we can focus on the development of the strategy model.


The expected save value measures the likelihood of a goalkeeper making a save from a shot. In the predictive model, shots are described with a collection of features including:

  • Shot location coordinates
  • Goal angle and distance
  • Previous angle and distance (for the ball touch before the shot)
  • Shot type: open play footed, headed, free-kick, penalty.
  • Cutback – binary feature indicating whether the shot was preceded by a cutback pass
  • First touch – binary feature to indicate whether the shot was a first-touch ball contact.

Three classifiers are trained on a dataset of all shots from the five most recent EPL seasons. We can now see how well an individual goalkeeper performs compared to average (predicted) outcome. The analysis shows that the Leicester keeper, Kasper Schmeichel, accounted for 4.6 of the 10.7 goal differential for Leicester.

Passing analysis

With current machine learning techniques, we can measure the difficulty of a pass – which can give us an indication on whether a defence is ‘forcing’ an attacker to attempt a more difficult pass than necessary or turning low probability transition situations into high probability transition situations such as a 50-50 challenge.

A pass difficulty model was trained on 480K labelled examples (pass made = 1, pass not made = 0). The passing model was based on prior work (e.g., ‘Beyond completion rate: evaluating the passing ability of footballers‘). In the following plot, based on “Hinton diagrams” the size of the square represents number of passes in each band, and the intensity of the colour represents the interception rate. Passes are broken into 5 bands based on proportion of passes completed according to passing difficulty (100% is the easiest pass).

… Leicester City exceled (ranking 1st) at regaining the ball for passes in the 21-40%, 41-60%, and 61-80% bands. At an individual level, N’Golo Kanté was the highlight of the team; he was the midfield player with the highest interception rate for passes in the ` The interpretation of a strategy plot is quite simple – the size of the square corresponds to number of shots (relative to the maximum value observed in the league for that category) and intensity of color corresponds to the effectiveness of each method with respect to expected goals (light red = low expectation, dark red = high).

So you can see for example that Leicester conceded a comparatively large number of shots via direct play, but were effective in dealing with them.

Here’s a strategy plot comparing all the teams in the league for the 2015/16, and their attacking and defending efficiency across shot types:

What immediately jumps out is that Leicester City had a lot of counter-attacks and were very effective using this method of scoring. They were also the team with the most penalties, 13 throughout the season.

Using strategic features for prediction

Can the strategic play features be useful for outcome prediction? The authors built four regression models to predict the number of shots and goals that a given team will score in a game. The models use increasing feature sets to see what impact adding in strategic play features has. In model 1 only the average number of shots and goals in the training dataset is available. Model 2 adds a binary input indicating home or away. Model 3 adds 14 inputs, one for each shot type indicating the number of types of each shot the team we are predicting for has made, and Model 4 adds another 14 inputs representing the average proportion of shots of each type taken by the opposition.

… the largest performance bump comes when we incorporate the strategy features for the team of interest, both in terms of number of goals and shots. Adding the style information of the opposition boosts the model once again. This confirms that the strategy plots and the data powering them are not just a description of teams’ preferences and habits, but that they are also correlated with their success or lack thereof on the pitch.

Extending this approach, a recommender was built to estimate the expected number of shots by type that teams will generate in an upcoming game.

Here we see predictions for Leicester City playing away against Liverpool (a tough opponent). Blue bars are the pre-match predictions, green bars are the actual match results.

And these are the predictions for a home game Leicester played a week later at home against weaker opposition:

The analysis tool we have used in this case study provide insightful information about strategy and efficiency in soccer. Furthermore…, their descriptive nature has a predictive byproduct that allows for their use as the core of a ‘recommendation engine’ that can support professionals in the challenging task of pre-game preparation.

Step by step towards creating a safe smart contract: lessons from a cryptocurrency lab

September 1, 2017

Step by step towards creating a safe smart contract: lessons and insights from a cryptocurrency lab Delmolino et al., 2015.

This is an experience report from teaching a smart contract programming course to undergraduates at the University of Maryland, back in the Fall of 2014. Of course that’s a very long time ago in the world of blockchains, and the paper shows its age in a few areas such as the use of the Serpent programming language rather than Solidity. Nonetheless, it still contains some good insights on the mindset change needed to start programming smart contracts vs. other kinds of development. From previous editions of The Morning Paper we know to be aware of mishandled exceptions, transaction-ordering dependencies, timestamp dependencies, and re-entrancy issues as well as general concurrent execution concerns. What else do Delmolino et al., have for us?

As we show below, surprisingly, even for a very simple smart contract it is difficult to create it correctly!

During the course students were divided into groups of four. In the first phase they created a smart contract program of their own choice. The contracts were then critiqued by the instructor, TAs, and fellow students, often revealing a number of flaws. In the second phase the students addressed the bugs found and amended their designs.

In contrast to traditional software development where bugs such as buffer overflows are typical, in our lab, we observed bugs and pitfalls that arise due to the unique nature of smart contract programs. Our lab experiences show that even for very simple smart contracts (e.g., a “Rock, Paper, Scissors” game), designing and implementing them correctly was highly non-trivial. This suggests that extra precautions and scrutiny are necessary when programming smart contracts.

In particular, contracts require an ‘economic thinking’ perspective to ensure fairness even when counterparties attempt to cheat in arbitrary ways to maximise their economic gains.

Using the rock, paper, scissors game, the authors illustrate three classes of mistake the students tended to make: leaking money, hiding in plain sight, and misaligned incentives.

Leaking money

Programming smart contracts typically involves encoding complex state machines. Logical errors in encoding state machines were commonly observed. The simplest type of logical error is a contract that leaks money in corner cases.

In the following extract, there are two ways money can be lost. On lines 19-20, if a third player tries to join and sends money, then the money becomes inaccessible. Note that miners control the ordering of transactions in a block, so a player attempting to join as the second player cannot always protect themselves from joining the party too late. Likewise on line 13, if a player sends an amount other than 1000 wei then the money will also be lost.

The following extract fixes these bugs:

Hiding in plain sight.

Another mistake is more subtle: players send their inputs in cleartext. Since transactions are broadcast across the entire cryptocurrency network, a cheating player may wait to see what his opponent chooses before providing his own input.

The solution to this is to use some form of cryptographic commitments – both players submit their commitment in one phase, and the inputs are revealed in a later phase. For example, players could create a random nonce and then send hash(input, nonce) as their commitment. You can see such a scheme at work in the following revised code:

Misaligned incentives.

What if one party waits for the other to open its commitment, and then seeing that it will lose, just abandons the games (does not send any further messages)? This saves the loser the gas cost of sending a transaction to open a commitment that won’t win, but also denies payment to the winner.

This generalizes to a broader question of how to ensure the incentive compatibility of a contract. Can any player profit by deviating from the intended behavior? Does the intended behavior have hidden costs?

To fix this issue, we need to make sure players are incented to do the right thing. For example: introducing a deadline before which the second player must reveal, otherwise the player who revealed first gets the reward. Additionally, players could make a security deposit in the first stage, which they forfeit unless they open their commitments in a timely manner.

The paper also introduces a number of Ethereum-specific gotchas, such as the call-stack limit. The ‘Security Considerations‘ section of the Ethereum docs does a good job today providing a high level overview of many of these: publicly visible state, re-entrancy, gas limits and loops, pitfalls when sending and receiving ether, callstack depth, and the use of tx.origin.

I also found the following ‘Ethereum Contract Security Techniques and Tips‘ document to be very useful. From the introduction:

Smart contract programming requires a different engineering mindset than you may be used to. The cost of failure can be high, and change can be difficult, making it in some ways more similar to hardware programming or financial services programming than web or mobile development. It is therefore not enough to defend against known vulnerabilities. Instead, you will need to learn a new philosophy of development…

If you want to get into smart contract development, it seems you’d better bring your A-game! Getting it right is going to require combinations of skills in verification, concurrency, cryptography, cybersecurity, and game theory as a starter!

Adding concurrency to smart contracts

August 31, 2017

Adding concurrency to smart contracts Dickerson et al., PODC’17

Yesterday we looked at how analogies from concurrent objects could help us understand smart contract behaviour. In today’s paper choice from PODC’17 (which also has one Maurice Herlihy on the author list) we get to borrow some ideas from concurrent objects to increase the concurrency of smart contracts.

Back in 2008 Herlihy & Koskinen published a paper on ‘Transactional boosting: a methodology for highly-concurrent transactional objects.‘ In the context of software transactional memory (STM), transactional boosting showed how to transform highly-concurrent base objects implemented without any notion of transactions into equally concurrent transaction objects that can safely be used with STM. Taking some ideas from transaction boosting, …

… This paper presents a novel way to permit miners and validators to execute smart contracts in parallel, based on techniques adapted from software transactional memory. Miners execute smart contracts speculatively in parallel, allowing non-conflicting contracts to proceed concurrently, and “discovering” a serialized concurrent schedule for a block’s transactions.

Why do we want more concurrency?

When a miner creates a block, includes a sequence of transactions, computing the new state for the transactions’ smart contracts serially, in the order in which they appear in the block. If the block is subsequently successfully appended to the blockchain, that block’s transactions are re-executed by every node to confirm that the state transitions were computed honestly and correctly.

To summarize, a transaction is executed in two contexts: once by miners before attempting to append a block to the blockchain, and many times afterward by validators checking that each block in the blockchain is honest. In both contexts, each block’s transactions are executed sequentially in block order.

Miners are rewarded for blocks that they successfully append to the blockchain, so they have an incentive to increase throughput by parallelizing smart contract executions. But simply executing contracts in parallel without any special precautions won’t work as the contracts may perform conflicting accesses to shared data leading to an inconsistent final state. Validators on the other hand end up performing the vast majority of contract executions and parallel execution here could bring significant benefits if it could be done safely.

Speculative smart contracts

We’ve seen previously that writing bug free smart contracts is hard.

Clearly, even sequential smart contracts must be written with care, and introducing explicit concurrency to contract programming languages would only make the situation worse. We conclude that concurrent smart contract executions must be serializable: indistinguishable, except for execution time, from a sequential execution.

Smart contracts read and modify shared storage, and they are written in Turing-complete languages – so it is impossible in the general case to determine statically whether or not contracts have data conflicts. Instead, contracts can be instrumented to detect synchronization conflicts at runtime, in a manner similar to that done in transaction boosting. Contracts are executed speculatively, and if a conflict does occur at runtime the conflict is resolved either by delaying one contract until the other completes, or rolling back and restarting one of the conflicting executions.

Speculation is controlled by two run-time mechanism, invisible to the programmer, and managed by the virtual-machine: abstract locks and inverse logs.

Storage operations are protected by abstract locks. If two storage operations map to distinct locks, then they must commute. Or to put it another way, operations that don’t commute must be protected by the same lock. It wasn’t clear to me from reading the paper whether this mapping of operations to locks can be performed automatically, or whether it requires human intervention. Before executing the operation, a thread must acquire the associated lock. When the lock is acquired, it records an inverse operation in a log (think undo log), and then proceeds with the operation.

If the action commits, its abstract locks are released and its log is discarded. If the action aborts, the inverse log is replayed, most recent operations first, to undo the effects of that speculative action. When the replay is complete, the actions’ abstract locks are released. The advantage of combining abstract locks with inverse logs is that the virtual machine can support very fine-grained concurrency.

If one contract calls another, a nested speculative action is created.

At the end of this process, the miner will have discovered a concurrent schedule for a block’s transactions, that is equivalent to some sequential schedule, only faster.


So far so good for the miners, but not so great for the validators. The problem is that the validators need to produce the same or an equivalent schedule of execution to that discovered by the miner. The solution is for miners to produce and publish extra information concerning the constraints discovered during execution. Why would miners make this available?

… that block [produced by the miner] may be competing with other blocks produced at the same time, and the miner will be rewarded only if the other miners choose to build on that block. Publishing a block with a parallel validation schedule makes the block more attractive for validation by other miners.

Here’s how it works:

  • Each lock includes a use counter keeping track of the number of times it has been released by a committing action during the construction of the current block.
  • When a speculative action commits, it increments the counters for each of the locks it holds, and registers a lock profile with the VM recording the abstract locks and their counter values.
  • When all the actions have committed, the common schedule can be reconstructed by comparing lock profiles. It is these profiles that the miner includes in the blockchain along with the usual information.

For example, consider three committed speculative actions, A, B, and C. If A and B have no abstract locks in common, they can run concurrently. If an abstract lock has counter value 1 in A’s profile and 2 in C’s profile, then C must be scheduled after A.

Using the algorithm below, a validator can construct a simple fork-join program that deterministically reproduces the miner’s original speculative schedule. Using a work-stealing scheduler, the validator can exploit whatever degree of parallelism it has available.

The validator keeps a thread-local trace of the abstract locks each thread would have acquired. If these traces don’t match the lock profiles provided by the miner the block is rejected.

Is it safe?

Correctness is argued by appeal to the analogy with transactional boosting, where serial equivalence has been proven.

Experimental results

The authors built an implementation based on the JVM for experimental purposes, using the Scala STM library for speculative action execution.

Examples of smart contracts were translated from Solidity into Scala, the modified to use the concurrency libraries. Each function from the Solidy contract is turned into a speculative transaction by wrapping its contents with a ScalaSTM atomic section. Solidity mapping objects are implemented as boosted hashtables, where key values are used to index abstract locks.

The benchmarks are based on Ballot, SimpleAuction, and EtherDoc contracts, as well as a workload mixing all three. The experiments used only three concurrent threads, but this was still sufficient to show a benefit:

The charts below give more detail of the speedups obtained at different conflict levels.

Our proposal for miners only is compatible with current smart contract systems such as Ethereum, but our overall proposal is not, because it requires including scheduling metadata in blocks and incentivizing miners to publish their parallel schedules. It may well be compatible with a future “soft fork” (backward compatible change), a subject for future research.