Paxos Made Live - An Engineering Perspective - Chandra et al. 2007 This is the fourth paper in a ten-part series on consensus. Yesterday we looked at Paxos Made Simple; today we hear from the team at Google that implemented Paxos at the core of Chubby. The paper reminds me of the following Yogi Berra … Continue reading Paxos Made Live
Tag: Distributed Systems
Core distributed systems topics, for example consistency, availability and so on.
Paxos made simple
Paxos made simple - Lamport 2001 This is part 3 of a 10-part series on consensus. Yesterday we looked at The Part-Time Parliament, Lamport's first paper introducing the Paxos algorithm, which takes an allegorical form. In today's choice, Lamport abandons the allegory and puts across the Paxos algorithm in plain English. The Paxos algorithm … Continue reading Paxos made simple
The Part-Time Parliament
The Part-Time Parliament - Lamport '90/'98 This is part 2 of a 10-part series on consensus. There's quite the back story to this paper. First submitted in 1990, researchers at the time didn't seem to take it seriously due to its presentation as an allegory, and failed to appreciate the fundamental contribution that we know … Continue reading The Part-Time Parliament
Viewstamped replication: A new primary copy method to support highly available distributed systems
Viewstamped replication: A new primary copy method to support highly available distributed systems - Oki & Liskov '88. Given a set of co-operating nodes that form a group, how can we replicate information to group members and maintain a consistent "one copy serializability" property as group members come and go? Oki and Liskov introduce two … Continue reading Viewstamped replication: A new primary copy method to support highly available distributed systems
Can’t we all just agree?
(Post updated to add links to write-ups of the papers now that the series is complete). We had to get here at some point! Inspired by the recent publication of Raft Refloated I thought it would be a good time to do a mini-series on consensus. Initially I'd planned out a series of 5 papers … Continue reading Can’t we all just agree?
Edelweiss: Automatic storage reclamation for distributed programming
Edelweiss: Automatic storage reclamation for distributed programming - Conway et al. 2014 This is the final selection from Peter Alvaro in his desert island paper week, and what a great paper to finish on! Please don't let the title of this paper put you off! To be honest, the title didn't really grab my attention … Continue reading Edelweiss: Automatic storage reclamation for distributed programming
Broadcast disks: data management for asymmetric communication environments
Broadcast Disks: Data Management for Asymmetric Communication Environments - Acharya et al. 1997. (This is the fourth of Peter Alvaro's 'desert island paper' selections). Do you remember teletext? Before the web, this was the only on-demand information service for the general population. In the UK, we had the wonderful Ceefax. You would choose your page … Continue reading Broadcast disks: data management for asymmetric communication environments
Knowledge and Common Knowledge in a Distributed Environment
Knowledge and Common Knowledge in a Distributed Environment - Halpern & Moses '90 (initial version 1984). This is the first of five 'Desert island papers' chosen by Peter Alvaro, and what a great choice to kick the week off with. It's a long read, coming in at 36 pages (45 if you include the proofs … Continue reading Knowledge and Common Knowledge in a Distributed Environment
The Chubby lock service for loosely coupled distributed systems
The Chubby lock service for loosely coupled distributed systems - Burrows '06 This paper describes the Chubby lock service at Google, which was designed as a coarse-grained locking service, found use mostly as a name service and configuration repository, and inspired the creation of Zookeeper. [Chubby's] design is based on well-known ideas that have meshed … Continue reading The Chubby lock service for loosely coupled distributed systems
ZooKeeper: wait-free coordination for internet scale systems
ZooKeeper: wait-free coordination for internet scale systems - Hunt et al. (Yahoo!) 2010 Distributed systems would be much simpler if the distributed parts didn't have to coordinate in some fashion. But it's this notion of 'working together' to achieve some aim that differentiates a distributed system from an unrelated bag of parts. Examples of the … Continue reading ZooKeeper: wait-free coordination for internet scale systems