Discretized Streams: Fault Tolerant Stream Computing at Scale - Zaharia et al. 2013 This is the Spark Streaming paper, and it sets out very clearly the problem that Discretized Streams were designed to solve: dealing effectively with faults and stragglers when processing streams in large clusters. This is hard to do in the traditional continuous … Continue reading Discretized Streams: Fault Tolerant Stream Computing at Scale
Category: Distributed Systems
Core distributed systems topics, for example consistency, availability and so on.
Spinning Fast Iterative Dataflows
Spinning Fast Iterative Dataflows - Ewen et al. 2012 Last week we saw how Naiad combines low-latency stream processing with iterative computation, and yesterday we looked in more detail at the Differential Dataflow model for incremental processing (needed for low-latency). The Apache Flink project also combines low-latency stream processing with support for incremental, iterative computation. … Continue reading Spinning Fast Iterative Dataflows
Differential Dataflow
Differential Dataflow - McSherry et al. 2013 The ability to perform complex analyses on [datasets that are constantly being updated] is very valuable; for example, each tweet published on the Twitter social network may supply new information about the community structure of the service’s users, which could be immediately exploited for real-time recommendation services or … Continue reading Differential Dataflow
Heracles: Improving Resource Efficiency at Scale
Heracles: Improving Resource Efficiency at Scale - Lo et al. 2015 Until recently, scaling from Moore’s law provided higher compute per dollar with every server generation, allowing datacenters to scale without raising the cost. However, with several imminent challenges in technology scaling, alternate approaches are needed. Those approaches involve increasing server utilization, which is still … Continue reading Heracles: Improving Resource Efficiency at Scale
Twitter Heron: Stream Processing at Scale
Twitter Heron: Stream Processing at Scale - Kulkarni et al. 2015 It's hard to imagine something more damaging to Apache Storm than this. Having read it through, I'm left with the impression that the paper might as well have been titled "Why Storm Sucks", which coming from Twitter themselves is quite a statement. There's a … Continue reading Twitter Heron: Stream Processing at Scale
Naiad: A Timely Dataflow System
Naiad: A Timely Dataflow System - Murray et al. 2013 Many data processing tasks require low-latency interactive access to results, iterative sub-computations, and consistent intermediate outputs so that sub-computations can be nested and composed. (For example, an) application that performs iterative processing on a real-time data stream, and supports interactive queries on a fresh, consistent … Continue reading Naiad: A Timely Dataflow System
The Drinking Philosophers Problem
The Drinking Philosophers Problem - Chandy & Misra 1984 How could I resist a paper with a title like that! The Drinking Philosophers is referenced in the PowerGraph paper as a solution to the problem of serializable execution in graph-parallel computation. Vertices are philosophers, and edges are forks (drinks). Let's take a look at how … Continue reading The Drinking Philosophers Problem
Detecting Termination of Distributed Computations Using Markers
Detecting Termination of Distributed Computations Using Markers - Misra 1983 There's an intriguing line in the Distributed GraphLab paper that caught my eye: "Termination is evaluated using distributed consensus algorithm described in [Ref]." Today's choice is the paper by Misra in 1983 that describes this distributed termination detection algorithm. The solution is similar in spirit … Continue reading Detecting Termination of Distributed Computations Using Markers
Scalability! But at what COST?
Scalability! But at what COST? - McSherry et al. 2015 With thanks to Felix Cuadrado, @felixcuadrado, for pointing this paper out to me via twitter. Scalability is highly prized, yet it can be a misleading metric when studied in isolation. McSherry et al. study the COST of distributed systems: the Configuration that Outperforms a Single … Continue reading Scalability! But at what COST?
Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs
Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs - Yan et al. 2014 We've looked at a lot of different Graph-processing systems over the last couple of weeks (onto a new topic next week I promise!), and despite a variety of different implementation and execution models, one thing they all have in common … Continue reading Blogel: A Block-Centric Framework for Distributed Computation on Real-World Graphs