Moment-based quantile sketches for efficient high cardinality aggregation queries

Gan et al., VLDB'18

Today we’re temporarily pausing our tour through some of the OSDI’18 papers in order to look at a great sketch-based data structure for quantile queries over high-cardinality aggregates. That’s a bit of a mouthful, so let’s jump straight into an example of ...

Noria: dynamic, partially-stateful data-flow for high-performance web applications

Gjengset, Schwarzkopf et al., OSDI'18

I have way more margin notes for this paper than I typically do, and that’s a reflection of my struggle to figure out what kind of thing we’re dealing with here. Noria doesn’t want to fit neatly into any existing box! We’ve ...

RobinHood: tail latency aware caching – dynamic reallocation from cache-rich to cache-poor

Berger et al., OSDI'18

It’s time to rethink everything you thought you knew about caching! My mental model goes something like this: we have a set of items that probably follow a power-law of popularity. We have a certain finite cache capacity, and ...

Maelstrom: mitigating datacenter-level disasters by draining interdependent traffic safely and efficiently

Veeraraghavan et al., OSDI'18

Here’s a really valuable paper detailing four-plus years of experience dealing with datacenter outages at Facebook. Maelstrom is the system Facebook use in production to mitigate and recover from datacenter-level disasters. The high-level idea is simple: drain traffic ...

LegoOS: a disseminated, distributed OS for hardware resource disaggregation

Shan et al., OSDI'18

One of the interesting trends in hardware is the proliferation and importance of dedicated accelerators as general-purpose CPUs stopped benefitting from Moore’s law. At the same time we’ve seen networking getting faster and faster, causing us to rethink some of the ...

Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding

Hundman et al., KDD'18

How do you effectively monitor a spacecraft? That was the question facing NASA’s Jet Propulsion Laboratory as they looked forward towards exponentially increasing telemetry data rates for Earth Science satellites (e.g., around 85 terabytes/day for a Synthetic Aperture Radar satellite). Spacecraft are ...