Chainspace: a sharded smart contracts platform, Al-Bassam et al., NDSS’18
Chainspace is a DApp (decentralised application) platform based on smart contracts, designed for higher scalability than is currently achievable with Bitcoin or Ethereum.
Our modest testbed of 60 cores achieves 350 transactions per second, as compared with a peak rate of less than 7 transactions per second for Bitcoin over 6K full nodes. Ethereum currently processes 4 transactions per second, out of a theoretical maximum of 25.
Chainspace manages state in objects, which are sharded across nodes. One very interesting twist is the distinction between procedures that execute a computation, and independent checkers that can verify a computation was carried out correctly. Only one party executes a transaction, the others just need to verify it. That makes for lots to think about as you work through the paper.
Objects, contracts, and transactions
State in Chainspace is stored in objects. Every object has a unique identifier (which can be used to verify its provenance, as we’ll see later). Objects are also immutable, so the only way to ‘update’ state is to create a new object (with a new id).
Contracts define a set of initial objects that are created when the contract is first created within Chainspace. A contract defines a set of types for all the inputs and outputs, a set of procedures, and a checker.
Procedures take as input both object ‘inputs’ and references. An object that is passed to a procedure as an input is consumed by that procedure and marked inactive — it can never be passed to another procedure again. References give read-only access to object state on the chain. The output of a procedure is a set of (new) objects. Within the procedure there may be local parameters and local return values (from calling other contracts). Some of these may contain secret information and require confidentiality. Procedures do not have to be pure functions, and may be randomised, keep state, or have side effects.
Associated with each smart contract c, we define a checker denoted as v. Those checkers are pure functions (i.e., deterministic and have no side-effects), and return a Boolean value. A checker v is defined by a contract, and takes as parameters a procedure p, as well as inputs, outputs, references and locals. Note that checkers do not take any secret local parameters.
A transaction is an atomic application of one or more procedures to active input objects, and possibly some referenced objects, to create some new output objects. “A user client executes all the computations necessary to determine the outputs of one or more procedures forming a transaction, and provides enough evidence to the system to check the validity of the execution and the new objects.”
Once a transaction is accepted in the system it ‘consumes’ the input objects, that become inactive, and brings to life all new output objects that start their life by being active. References on the other hand must be active for the transaction to succeed, and remain active once a transaction has been successfully committed.
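The lifecycle rules above (inputs are consumed, references stay active, outputs are born active) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Ledger` class, its `commit` method, and the object names are hypothetical and not part of Chainspace, and checkers and consensus are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Ledger:
    """Toy in-memory record of object status (hypothetical, for illustration)."""
    active: set = field(default_factory=set)
    inactive: set = field(default_factory=set)

    def commit(self, inputs, references, outputs):
        inputs, references, outputs = set(inputs), set(references), set(outputs)
        # Inputs and references must all be active for the transaction to succeed.
        if not (inputs | references) <= self.active:
            return False
        # Output identifiers must be fresh: objects are immutable and never reused.
        if outputs & (self.active | self.inactive):
            return False
        # Inputs are consumed and become inactive; references are left untouched.
        self.active -= inputs
        self.inactive |= inputs
        # New outputs start their life active.
        self.active |= outputs
        return True


ledger = Ledger(active={"coin-1", "policy"})
first = ledger.commit(inputs=["coin-1"], references=["policy"], outputs=["coin-2"])
# Replaying the same input fails, because "coin-1" is now inactive.
replay = ledger.commit(inputs=["coin-1"], references=["policy"], outputs=["coin-3"])
```

The second call is rejected for the same reason a double-spend would be: an object can be an input to at most one committed transaction.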
It’s all very reminiscent of ownership typing in Rust.
Within Chainspace, a transaction is represented by a sequence of traces of the executions of the procedures that compose it. Traces contain information about the contract, procedures, input objects, references and local parameters, as well as output objects and local returns (but not secret objects or returns). A trace can depend on other traces that came before it (causality), and is valid if its sequence of dependencies is valid and, (i) all inputs and references are valid, (ii) a trace that produces output objects must also consume some input objects, and (iii) all objects passed to the checker must be types defined by the smart contract.
The identifier of a trace is a cryptographic hash of all the information contained within it, and the identifier of an object is derived through the application of a cryptographic hash function to the identifier of the trace that created the object, as well as a unique name assigned to the output object by the procedures creating the trace.
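A rough sketch of that derivation follows. The choice of SHA-256, the canonical JSON encoding, the `|` separator, and the field names in the example trace are all assumptions made for illustration; the paper's actual encoding may differ.

```python
import hashlib
import json


def trace_id(trace: dict) -> str:
    """Hash everything in the trace (canonical JSON encoding is an assumption)."""
    encoded = json.dumps(trace, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(encoded).hexdigest()


def object_id(tid: str, output_name: str) -> str:
    """Derive an object id from the creating trace's id and the output's unique name."""
    return hashlib.sha256(f"{tid}|{output_name}".encode()).hexdigest()


# Hypothetical trace contents, for illustration only.
trace = {
    "contract": "SVote",
    "procedure": "addVote",
    "inputs": ["<id of previous vote object>"],
    "references": [],
    "outputs": ["vote-0", "vote-1"],
}
tid = trace_id(trace)
oid_a = object_id(tid, "vote-0")
oid_b = object_id(tid, "vote-1")
```

Because the input object ids are themselves hashes of earlier traces, changing anything in an object's history changes its identifier.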
An object identifier is (thus) a high-integrity handle that may be used to authenticate the full history that led to the existence of the object.
It all makes sense in principle, but I found it difficult to get my head around how I would practically write a checker for a procedure. Things clicked for me when I thought of it in terms of Hoare triples: we are given the procedure inputs, and can verify the pre-conditions. Then the procedure itself executes, which can be a black box to us. We’re left with the outputs, which we can verify meet the post-conditions (conditioned on the inputs).
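Here is what that Hoare-triple reading might look like for a simple balance transfer. This is a hedged sketch: the account objects, their `owner` and `balance` fields, and the `transfer_checker` name are hypothetical and not taken from the paper's contracts. The checker never runs the procedure; it only inspects inputs and outputs.

```python
def transfer_checker(inputs, outputs, amount):
    """Pure function: True iff `outputs` is a valid transfer of `amount` from `inputs`."""
    if len(inputs) != 2 or len(outputs) != 2:
        return False
    src, dst = inputs
    new_src, new_dst = outputs
    # Pre-conditions: checked on the inputs alone.
    pre = amount > 0 and src["balance"] >= amount
    # Post-conditions: checked on the outputs, conditioned on the inputs.
    post = (
        new_src["owner"] == src["owner"]
        and new_dst["owner"] == dst["owner"]
        and new_src["balance"] == src["balance"] - amount
        and new_dst["balance"] == dst["balance"] + amount
    )
    return pre and post


alice, bob = {"owner": "alice", "balance": 10}, {"owner": "bob", "balance": 0}
ok = transfer_checker(
    [alice, bob],
    [{"owner": "alice", "balance": 7}, {"owner": "bob", "balance": 3}],
    amount=3,
)
overdraft = transfer_checker(
    [alice, bob],
    [{"owner": "alice", "balance": -5}, {"owner": "bob", "balance": 15}],
    amount=15,
)
```

The first call passes both conditions; the second fails the pre-condition because the source balance is smaller than the amount.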
Perhaps an example will also help. Consider a smart voting system with a contract SVote and three types SVote.Token, SVote.Vote, and SVote.Tally. There are three procedures: SVote.createElection, SVote.addVote, and SVote.tally. When SVote.addVote is called, it is passed a new vote to add, homomorphically encrypted and signed by the voter. The voter also provides a zero-knowledge proof certifying that her vote is a binary value and that she voted for exactly one option. The checker can assert the correctness of the votes by verifying the associated signatures and zero-knowledge proofs, without ever having to learn the clear value of the votes.
Defining smart contract logic as checkers allows Chainspace to support privacy-friendly contracts by design. In such contracts some information in objects is not in the clear, but instead either encrypted using a public key, or committed using a secure commitment scheme. The transaction only contains a valid proof that the logic or invariants of the smart contract procedure were applied correctly or hold respectively, and can take the form of a zero-knowledge proof, or a Succinct Argument of Knowledge (SNARK)… the checker runs the verifier part of the proof or SNARK that validates the invariants of the transactions, without revealing the secrets within the objects to the verifiers.
It’s time to take a brief look under the hood!
A Chainspace network comprises a set of nodes that manage valid objects and ensure only valid transactions get committed. Nodes are organised into shards. Shards partition the set of objects.
Nodes within a shard reach consensus on whether to accept or reject a transaction, whether an object is active or inactive, and whether traces from the contracts they know pass their checkers. Transactions can span shards, so there is also voting at the shard level to reach consensus on commit: a transaction commits only if all the shards concerned agree unanimously. Nodes in each shard periodically publish a signed hash chain of checkpoints, each block recording evidence of transactions processed in the current epoch.
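The shard-level rule can be sketched as a single decision function. This is a deliberately minimal model: each shard's local decision is taken as given (the Byzantine agreement inside a shard is not modelled), and the function name, shard names, and the "commit"/"abort" strings are hypothetical.

```python
def cross_shard_decision(shard_votes: dict) -> str:
    """Commit only if every concerned shard voted to commit; otherwise abort."""
    if not shard_votes:
        raise ValueError("a transaction must concern at least one shard")
    if all(vote == "commit" for vote in shard_votes.values()):
        return "commit"
    return "abort"


# Every concerned shard accepts, so the transaction commits.
unanimous = cross_shard_decision({"shard-A": "commit", "shard-B": "commit"})
# One shard rejects (for example, an input object it manages is already inactive).
vetoed = cross_shard_decision({"shard-A": "commit", "shard-B": "abort"})
```

A single dissenting shard is enough to abort, which is what keeps an object from being consumed in one shard while the rest of the transaction fails elsewhere.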
An honest shard is one with at least 3f+1 nodes, where adversaries control no more than f faulty nodes. Safety and liveness are guaranteed in honest shards. With dishonest shards (where the adversary controls more than f of the nodes), correctness or liveness cannot be guaranteed, but foul play is guaranteed to be detectable.
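The arithmetic behind the honest-shard definition is easy to make concrete. The sketch below takes a shard of n nodes and computes the largest f satisfying n ≥ 3f+1; the function names are hypothetical and not from the paper.

```python
def max_faulty(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    if n < 1:
        raise ValueError("a shard needs at least one node")
    return (n - 1) // 3


def is_honest_shard(n: int, adversarial: int) -> bool:
    """True if the adversary controls no more than f of the shard's n nodes."""
    return adversarial <= max_faulty(n)


four_nodes = max_faulty(4)                # 4 = 3*1 + 1, so one fault is tolerated
seven_nodes = max_faulty(7)               # 7 = 3*2 + 1, so two faults are tolerated
still_honest = is_honest_shard(7, 2)      # exactly f adversarial nodes
compromised = not is_honest_shard(7, 3)   # more than f adversarial nodes
```

Note how small the numbers are: in a seven-node shard, three adversarial nodes are already enough to make the shard dishonest.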
There are many options for ensuring that concerned nodes in each shard do not reach an inconsistent state for the accepted transactions, such as Nakamoto consensus through proof-of-work, two-phase commit protocols and classical consensus protocols like Paxos. However, these approaches lack in performance, scalability, and/or security. We design an open, scalable and decentralized mechanism to perform Sharded Byzantine Atomic Commit or S-BAC.
S-BAC is a combination of Byzantine agreement (using the ModSmart implementation of PBFT) and atomic commit using a two-phase commit protocol inspired by Gray and Lamport. You can find more details in §IV.C of the paper.
The higher levels of Chainspace functionality are implemented using system smart contracts:
Effectively, instantiation of Chainspace is the combination of nodes running the basic S-BAC protocol, as well as a set of system smart contracts providing flexible policies about managing shards, smart contract creation, auditing and accounting.
Since Chainspace is intended to be an open system, there is also a CSCoin smart contract for tracking value. It can be composed with other procedures to enable payments for processing transactions. Shards can advertise that they will only consider actions valid if some value of CSCoin is transferred to their constituent nodes.
It seems much easier for an adversary to overpower one or more shards in a Chainspace system than it is in systems where all nodes are contributing globally. The set of Chainspace nodes is divided into shards, and within a shard an adversary just needs to supply or control more than f of that shard's nodes. I couldn’t see anything in the paper about the rules for new nodes joining shards. If that is also open, then could I flood a shard with my own nodes? If it is not open, then is there a central point of trust for node membership?
In the limitations and future work section, the authors have this to say:
In case one or more shards are malicious, we provide an auditing mechanism for honest nodes in honest shards to detect the inconsistency and to trace the malicious shard. Through the Hash-DAG structure, it is also possible to fully audit the histories of two objects, and to ensure that the validity rules hold jointly — in particular the double-use rules. However, it is not clear how to automatically recover from detecting such an inconsistency…