How Much Up-Front? A Grounded Theory of Agile Architecture

How Much Up-Front? A Grounded Theory of Agile Architecture - Waterman et al. 2015
It's time for something a little bit different, so this week I thought I'd bring you a selection of papers from the recently held ICSE'15 conference (International Conference on Software Engineering). To kick things off, today's choice looks at the question …

Discretized Streams: Fault Tolerant Stream Computing at Scale

Discretized Streams: Fault Tolerant Stream Computing at Scale - Zaharia et al. 2013
This is the Spark Streaming paper, and it sets out very clearly the problem that Discretized Streams were designed to solve: dealing effectively with faults and stragglers when processing streams in large clusters. This is hard to do in the traditional continuous …

Spinning Fast Iterative Dataflows

Spinning Fast Iterative Dataflows - Ewen et al. 2012
Last week we saw how Naiad combines low-latency stream processing with iterative computation, and yesterday we looked in more detail at the Differential Dataflow model for incremental processing (needed for low-latency). The Apache Flink project also combines low-latency stream processing with support for incremental, iterative computation. …

Heracles: Improving Resource Efficiency at Scale

Heracles: Improving Resource Efficiency at Scale - Lo et al. 2015
Until recently, scaling from Moore’s law provided higher compute per dollar with every server generation, allowing datacenters to scale without raising the cost. However, with several imminent challenges in technology scaling, alternate approaches are needed. Those approaches involve increasing server utilization, which is still …

Naiad: A Timely Dataflow System

Naiad: A Timely Dataflow System - Murray et al. 2013
Many data processing tasks require low-latency interactive access to results, iterative sub-computations, and consistent intermediate outputs so that sub-computations can be nested and composed. (For example, an) application that performs iterative processing on a real-time data stream, and supports interactive queries on a fresh, consistent …

A higher order estimate of the optimum checkpoint interval for restart dumps

A higher order estimate of the optimum checkpoint interval for restart dumps - Daly 2004
TL;DR: if you know how long it takes your system to create a checkpoint/snapshot (δ), and you know the expected mean-time between failures (M), then set the checkpoint interval to be √(2δM) - δ. OK, I grant that today's paper …
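
To make the TL;DR formula concrete, here is a minimal Python sketch that evaluates √(2δM) - δ. The function name `checkpoint_interval`, its parameter names, and the input checks are my own assumptions for illustration and do not come from the paper. This covers only the simple expression quoted above, not any refinement the paper goes on to develop.

```python
import math


def checkpoint_interval(delta: float, mtbf: float) -> float:
    """Evaluate sqrt(2 * delta * M) - delta from the TL;DR.

    delta: time to write one checkpoint/snapshot.
    mtbf:  expected mean time between failures (M), in the same unit.
    Both names are hypothetical, chosen for this sketch.
    """
    if delta <= 0 or mtbf <= 0:
        raise ValueError("delta and mtbf must both be positive")
    interval = math.sqrt(2.0 * delta * mtbf) - delta
    if interval <= 0:
        # Assumption of this sketch: treat a non-positive result as out of
        # range for the simple formula rather than returning it.
        raise ValueError("formula gives a non-positive interval for these inputs")
    return interval


# Example: 2-minute checkpoints, 100 minutes between failures on average.
# sqrt(2 * 2 * 100) - 2 = 20 - 2 = 18 minutes between checkpoints.
print(checkpoint_interval(2.0, 100.0))
```

Because δ and M only need to share a unit, the same call works for seconds or hours. The square root means that quadrupling M merely doubles the √(2δM) term.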

Detecting Termination of Distributed Computations Using Markers

Detecting Termination of Distributed Computations Using Markers - Misra 1983
There's an intriguing line in the Distributed GraphLab paper that caught my eye: "Termination is evaluated using distributed consensus algorithm described in [Ref]." Today's choice is the paper by Misra in 1983 that describes this distributed termination detection algorithm. The solution is similar in spirit …