Asynchronous Complex Analytics in a Distributed Dataflow Architecture - Gonzalez et al. 2015 Here's a theme we've seen before: the programming model offered by large scale distributed systems doesn't always lend itself to efficient algorithms for solving certain classes of problems. In today's paper, Gonzalez et al. examine the growing gap between efficient machine learning …
FIT: A Distributed Database Performance Trade-off
FIT: A Distributed Database Performance Trade-off - Faleiro & Abadi, 2015 If the CAP FITs... This paper presents the FIT trade-off for distributed transactions: you can have any two of Fairness, (strong) Isolation, and Throughput, but not all three. Which also implies you can have both strong isolation and high throughput! As a consequence of …
GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores
GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores - Li & Cox 2013 One of the wonderful things about reading papers and being exposed to lots of different problems and their solutions is that you never know when an idea might resurface and be useful in a new context or challenge you are facing. Yesterday …
Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility
Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility - Varghese & Lauck 1987 Yasuhiro Matsuda recently wrote a blog post describing Apache Kafka's use of Hierarchical Timing Wheels to keep track of large numbers of outstanding requests. In the Kafka use case, each request lives in a 'purgatory' …
Moving Fast with Software Verification
Moving Fast with Software Verification - Calcagno et al. 2015 This is a story of transporting ideas from recent theoretical research in reasoning about programs into the fast-moving engineering culture of Facebook. The context is that most of the authors landed at Facebook in September of 2013, when we brought the INFER static analyser with …
Fail at Scale & Controlling Queue Delay
Controlling Queue Delay - Nichols & Van Jacobson, 2012, and Fail at Scale - Maurer, 2015 Fail at Scale (Maurer) Ben Maurer recently wrote a great article for ACM Queue on how Facebook achieves reliability in the face of rapid change: To keep Facebook reliable in the face of rapid change we study common patterns …
Minimizing Faulty Executions of Distributed Systems
Minimizing Faulty Executions of Distributed Systems - Scott et al. Now that we've spent a couple of days looking at test case minimizing for sequential systems, we're ready to tackle Colin Scott et al.'s paper on doing the same for executions of distributed systems. This is the paper that describes the core system behind Colin's …
Hierarchical Delta Debugging
Hierarchical Delta Debugging - Misherghi & Su, 2006 The thing I find striking about the delta debugging approach we saw yesterday is that with no understanding of the syntax of the input at all, it is still able to simplify, for example, a C program - despite the fact that nearly all of the subsets …
Simplifying and Isolating Failure-Inducing Input
Simplifying and Isolating Failure-Inducing Input - Zeller et al. 2002 The most common question I get asked about The Morning Paper is 'how do you find time to read so many papers?' The second most common question is 'how do you find interesting papers?' Sometimes it goes like this: Colin Scott writes a great blog …
Scrap Your Boilerplate with Object Algebras
Scrap Your Boilerplate with Object Algebras - Zhang et al. 2015 We've seen Object Algebras once before on The Morning Paper when we looked at extensible streaming APIs. Today's paper choice uses the extensible properties of object algebras to help remove some of the boilerplate code traditionally associated with implementing visitors that traverse ASTs. The …