Cloud computing simplified: a Berkeley view on serverless computing

March 13, 2019 ~ adriancolyer ~ 8 Comments

Cloud programming simplified: a Berkeley view on serverless computing, Jonas et al., arXiv 2019. With thanks to Eoin Brazil, who first pointed this paper out to me via Twitter. Ten years ago Berkeley released the ‘Berkeley view of cloud computing’ paper, predicting that cloud use would accelerate. Today’s paper choice is billed as its logical …

Efficient synchronisation of state-based CRDTs

March 11, 2019 ~ adriancolyer

Efficient synchronisation of state-based CRDTs, Enes et al., arXiv’18. CRDTs are a great example of consistency as logical monotonicity. They come in two main variations. Operation-based CRDTs send operations to remote replicas using a reliable dissemination layer with exactly-once causal delivery. (If operations are idempotent then at-least-once delivery is OK too.) State-based CRDTs exchange information about …
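
To make the state-based variation concrete, here is a minimal sketch of a grow-only counter (G-Counter), a standard textbook state-based CRDT. It is not taken from the Enes et al. paper, and the class and method names are illustrative assumptions. Each replica ships its state, and the receiver merges it with an element-wise maximum, so duplicated or reordered deliveries do no harm.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Illustrative state-based grow-only counter (hypothetical names)."""

    replica_id: str
    # One monotonically increasing count per replica that has incremented.
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        """Bump only this replica's own entry."""
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        """The counter value is the sum over all replicas' entries."""
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        """Join two states with an element-wise max.

        The join is commutative, associative and idempotent, so states
        can arrive late, out of order, or more than once.
        """
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


# Two replicas update independently, then exchange state.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)

a.merge(b)
a.merge(b)  # a duplicate delivery changes nothing
b.merge(a)

assert a.value() == b.value() == 5
```

Note that this sketch always ships the whole state on every exchange, which grows expensive as the state gets larger.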

A generalised solution to distributed consensus

March 8, 2019 ~ adriancolyer ~ 12 Comments

A generalised solution to distributed consensus, Howard & Mortier, arXiv'19. This is a draft paper that Heidi Howard recently shared with the world via Twitter, and here’s the accompanying blog post. It caught my eye for promising a generalised solution to the consensus problem, and also for using reasoning over immutable state to get there. …

Keeping CALM: when distributed consistency is easy

March 6, 2019 ~ adriancolyer ~ 20 Comments

Keeping CALM: when distributed consistency is easy, Hellerstein & Alvaro, arXiv 2019. The CALM conjecture (and later theorem) was first introduced to the world in a 2010 keynote talk at PODS. Behind its simple formulation there’s a deep lesson to be learned with the power to create ripples through our industry akin to the influence …

Fixed it for you: protocol repair using lineage graphs

February 1, 2019 ~ adriancolyer ~ 2 Comments

Fixed it for you: protocol repair using lineage graphs, Oldenburg et al., CIDR'19. This is a cool paper on a number of levels. Firstly, the main result that catches my eye is that it’s possible to build a distributed systems ‘debugger’ that can suggest protocol-level fixes. For example, say you have a system that sometimes sends …

BEAT: asynchronous BFT made practical

November 26, 2018 ~ adriancolyer

BEAT: asynchronous BFT made practical, Duan et al., CCS'18. Reaching agreement (consensus) is hard enough; doing it in the presence of active adversaries who can tamper with or destroy your communications is much harder still. That’s the world of Byzantine fault tolerance (BFT). We’ve looked at Practical BFT (PBFT) and HoneyBadger in previous editions of …

ScootR: scaling R dataframes on dataflow systems

November 19, 2018 ~ adriancolyer

ScootR: scaling R dataframes on dataflow systems, Kunft et al., SoCC'18. The language of big data is Java (/ Scala). The languages of data science are Python and R. So what do you do when you want to run your data science analysis over large amounts of data? ...programming languages with rich support for …

Overload control for scaling WeChat microservices

November 16, 2018 ~ adriancolyer ~ 9 Comments

Overload control for scaling WeChat microservices, Zhou et al., SoCC'18. There are two reasons to love this paper. First, we get some insights into the backend that powers WeChat; and second, the authors share the design of DAGOR, the battle-hardened overload control system that has been in production at WeChat for five years. …

Debugging distributed systems with why-across-time provenance

November 12, 2018 ~ adriancolyer ~ 3 Comments

Debugging distributed systems with why-across-time provenance, Whittaker et al., SoCC'18. This value is 17 here, and it shouldn’t be. Why did the get request return 17? Sometimes the simplest questions can be the hardest to answer. As the opening sentence of this paper states: “Debugging distributed systems is hard.” The kind of why questions we’re …

The FuzzyLog: a partially ordered shared log

November 2, 2018 ~ adriancolyer ~ 1 Comment

The FuzzyLog: a partially ordered shared log, Lockerman et al., OSDI'18. If you want to build a distributed system, then having a distributed shared log as an abstraction to build upon (one that gives you an agreed-upon total order for all events) is such a big help that it’s practically cheating! (See …