Asynchronous Distributed Snapshots for Distributed Dataflows

Asynchronous Distributed Snapshots for Distributed Dataflows - Carbone et al. 2015 The team behind Apache Flink and data Artisans are a smart group of folks. Their recent blog post on High-throughput, low-latency, and exactly-once stream processing with Apache Flink is well worth reading and has a good description of the evolution of streaming architectures, the … Continue reading Asynchronous Distributed Snapshots for Distributed Dataflows

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing - Akidau et al. (Google) - 2015 With thanks to William Vambenepe for suggesting this paper via twitter. Google Cloud Dataflow reached GA last week, and the team behind Cloud Dataflow have a paper accepted at VLDB'15 … Continue reading The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

Lasp: A language for distributed, coordination-free programming

Lasp: A language for distributed, coordination-free programming - Meiklejohn & Van Roy 2015 * Update: fixed typo in Chris' surname above. * With thanks to Colin Barrett for suggesting today's choice, and to Chris Meiklejohn for providing a link to a paywall-free preprint of the paper. Christopher Meiklejohn recently announced he is leaving Basho to … Continue reading Lasp: A language for distributed, coordination-free programming

PerfBlower: Quickly Detecting Memory-Related Performance Problems via Amplification

PerfBlower: Quickly Detecting Memory-Related Performance Problems via Amplification - Fang et al. 2015 Another ECOOP '15 paper, and definitely something with immediate pragmatic utility. PerfBlower finds heap-related performance problems during regular test runs (not exhaustive performance tests) by amplifying the effects of small issues to make them visible. The user provides details of classes of … Continue reading PerfBlower: Quickly Detecting Memory-Related Performance Problems via Amplification

Streams à la carte: Extensible pipelines with object algebras

Streams à la carte: Extensible pipelines with object algebras - Biboudis et al. 2015 Streaming APIs are popping up everywhere, allowing the programmer to express streaming computations such as: int sum = IntStream.of(v) .filter(x -> x % 2 == 0) .map(x -> x * x) .sum(); On examining the streaming libraries in Java, Scala, and … Continue reading Streams à la carte: Extensible pipelines with object algebras

Cloud Types for Eventual Consistency

Cloud Types for Eventual Consistency - Burckhardt et al. 2012 Providing good programming abstractions for cloud storage, synchronization, and disconnected operation appears crucial to accelerate the production of useful and novel applications on today’s and tomorrow’s mobile devices. This paper proposes a model based on cloud types (which may be integrated with a programming language). … Continue reading Cloud Types for Eventual Consistency

Access Rights Analysis in the Presence of Subjects

Access Rights Analysis in the Presence of Subjects - Centonze et al. 2015 Security in application code is a cross-cutting concern and hence very difficult to get right since the analysis often depends on non-local effects. Java and the .NET CLR both have a declarative permissions model that can grant permissions both to code, and … Continue reading Access Rights Analysis in the Presence of Subjects