Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing

Mesa: Geo-Replicated, Near Real-Time, Scalable Data Warehousing - Google 2014. Mesa is another in the tapestry of systems that support Google's advertising business. Previous editions of The Morning Paper have covered Photon, Spanner, F1, and F1's online schema update mechanism. Mesa is a highly scalable analytic data warehousing system that stores critical measurement data related …

Spanner: Google’s Globally Distributed Database

Spanner: Google's Globally Distributed Database - Google 2012. Since we've spent the last two days looking at F1 and its online asynchronous schema change support, it seems appropriate today to look at Spanner, the system that underpins them both. There are three interesting stories that come out of the paper for me, each of which …

Online, Asynchronous Schema Change in F1

Online, Asynchronous Schema Change in F1 - Rae et al. 2013. Continuous deployment and evolution of running services with zero downtime is the holy grail. With stateless services this is comparatively easy to achieve. But once we have stateful services, and especially large volumes of data in a store, things get more difficult. We would ideally …

The Log-Structured Merge-Tree (LSM Tree)

The Log-Structured Merge-Tree (LSM Tree) - O'Neil et al. '96. Log-Structured Merge is an important technique used in many modern data stores (for example, BigTable, Cassandra, HBase, Riak, ...). Suppose you have a hierarchy of storage options for data - for example, RAM, SSDs, spinning disks, with different price/performance characteristics. Furthermore, you have a large …

The Declarative Imperative: Experiences and Conjectures in Distributed Logic

The Declarative Imperative: Experiences and Conjectures in Distributed Logic - Hellerstein 2010. This paper is an extended version of an invited talk that Joe Hellerstein gave to the ACM PODS conference in 2010. The primary audience is therefore database researchers, but there's some good food for thought for the rest of us in there too. …

Highly Available Transactions: Virtues and Limitations

Highly Available Transactions: Virtues and Limitations - Bailis et al. 2014. Since yesterday we looked at the Boom Hierarchy, it seemed fitting today to take a selection from the BOOM project (no relation). Thus earning me the Basil Brush award ;) What a great paper this is, I have so many highlights and annotations on …