The Potential Dangers of Causal Consistency and an Explicit Solution

The Potential Dangers of Causal Consistency and an Explicit Solution - Bailis et al. 2012. Yesterday we saw how we could get both better performance and stronger consistency by upgrading from eventual consistency to causal consistency. Are there any downsides? With useful semantics, low latency, partition tolerance, and, recently, a demonstrably efficient architecture, causal consistency …

Probabilistically Bounded Staleness for Practical Partial Quorums

Probabilistically Bounded Staleness for Practical Partial Quorums - Bailis et al. 2012, and Quantifying Eventual Consistency with PBS - Bailis et al. 2014. 'Probabilistically Bounded Staleness...' was the original VLDB '12 paper, and then the authors were invited to submit an extended version to the VLDB Journal ('Quantifying Eventual Consistency...') which was published in …

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

MillWheel: Fault-Tolerant Stream Processing at Internet Scale - Akidau et al. (Google) 2013. Earlier this week we looked at the Google Cloud Dataflow model, which is implemented on top of FlumeJava (for batch) and MillWheel (for streaming): We have implemented this model internally in FlumeJava, with MillWheel used as the underlying execution engine for streaming …

Asynchronous Distributed Snapshots for Distributed Dataflows

Asynchronous Distributed Snapshots for Distributed Dataflows - Carbone et al. 2015. The team behind Apache Flink and data Artisans are a smart group of folks. Their recent blog post on High-throughput, low-latency, and exactly-once stream processing with Apache Flink is well worth reading and has a good description of the evolution of streaming architectures, the …

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing - Akidau et al. (Google) 2015. With thanks to William Vambenepe for suggesting this paper via Twitter. Google Cloud Dataflow reached GA last week, and the team behind Cloud Dataflow have a paper accepted at VLDB'15 …

Lasp: A language for distributed, coordination-free programming

Lasp: A language for distributed, coordination-free programming - Meiklejohn & Van Roy 2015. Update: fixed typo in Chris' surname above. With thanks to Colin Barrett for suggesting today's choice, and to Chris Meiklejohn for providing a link to a paywall-free preprint of the paper. Christopher Meiklejohn recently announced he is leaving Basho to …