Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity

Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity - Bailis et al. 2015 This paper is an absolute joy to read: seasoned database systems researchers conduct a study of real-world applications from the Ruby community and try not to show too much disdain at what they find, whilst pondering what it might all …

TAO: Facebook’s Distributed Data Store for the Social Graph

TAO: Facebook's Distributed Data Store for the Social Graph - Bronson et al. (Facebook) 2013 A single Facebook page may aggregate and filter hundreds of items from the social graph. We present each user with content tailored to them, and we filter every item with privacy checks that take into account the current viewer. This extreme …

Staring into the abyss: An evaluation of concurrency control with one thousand cores

Staring into the abyss: An evaluation of concurrency control with one thousand cores - Yu et al. 2014 A look at the 7 major concurrency control algorithms for OLTP DBMSs, and how well they perform when scaled to large numbers (1024) of cores. Each algorithm is optimised for the best in-memory performance possible, but …

Scaling Concurrent Log-Structured Data Stores

Scaling Concurrent Log-Structured Data Stores - Golan-Gueta et al. 2015 Key-value stores based on log-structured merge trees are everywhere. The original design was intended to mitigate slow disk I/O. Once this is achieved and we scale to more and more cores, the authors find that in-memory contention becomes the bottleneck (see yesterday's piece on …

Musketeer – Part II: all for one, and one for all in data processing systems

Musketeer: all for one, one for all in data processing systems - Gog et al. 2015 Musketeer gives you portability of data processing workflows across data processing systems. It can even analyse your workflow and recommend the best system to run it on, as well as combining systems for different parts of the workflow. …

Musketeer – Part I : What’s the best data processing system?

Musketeer: all for one, one for all in data processing systems - Gog et al. 2015 For 40-80% of the jobs submitted to MapReduce systems, you'd be better off just running them on a single machine... It was EuroSys 2015 last week, and a great new crop of papers was presented. Gog et al. …

Making Sense of Performance in Data Analytics Frameworks

Making Sense of Performance in Data Analytics Frameworks - Ousterhout et al. 2015 We all know the causes of poor performance in big data analytics workloads: network I/O, disk I/O, and straggler tasks. Ousterhout et al. set out to try and quantify this, and found that what we think we know isn't necessarily so. Yet …

Mojim: A Reliable and Highly-Available Non-Volatile Memory System

Mojim: A Reliable and Highly-Available Non-Volatile Memory System - Zhang et al. 2015 This is the second in a series of posts looking at the latest research from the recently held ASPLOS 15 conference. It seems like we've been anticipating NVMM (non-volatile main memory) for a while now, and there has been plenty of research …

Scalable Atomic Visibility with RAMP Transactions

Scalable Atomic Visibility with RAMP Transactions - Bailis et al. 2014 RAMP transactions came up last week as part of the secret sauce in Coordination avoidance in database systems that contributed to a 25x improvement on the TPC-C benchmark. So what exactly are RAMP transactions and why might we need them? As soon as you …