ApproxJoin: approximate distributed joins

November 9, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 1 Comment

Le Quoc et al., SoCC'18 (GitHub: https://ApproxJoin.github.io). The join is a fundamental data processing operation and has been heavily optimised in relational databases. When you’re working with large volumes of unstructured data though, say with a data processing framework such as Flink or Spark, joins become distributed and much more expensive. ...

ASAP: fast, approximate graph pattern mining at scale

November 7, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 5 Comments

Iyer et al., OSDI'18. I have a real soft spot for approximate computations. In general, we waste a lot of resources on overly accurate analyses when understanding the trends and/or the neighbourhood is quite good enough (do you really need to know it’s 78.763895% vs ...

Sharding the shards: managing datastore locality at scale with Akkio

November 5, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 3 Comments

Annamalai et al., OSDI'18. In Harry Potter, the Accio Summoning Charm summons an object to the caster of the spell, sometimes transporting it over a significant distance. At Facebook, Akkio summons data to a datacenter with the goal of improving data access locality for clients. ...

The FuzzyLog: a partially ordered shared log

November 2, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 1 Comment

Lockerman et al., OSDI'18. If you want to build a distributed system, then having a distributed shared log as an abstraction to build upon (one that gives you an agreed-upon total order for all events) is such a big help that it’s practically cheating! (See ...

Moment-based quantile sketches for efficient high cardinality aggregation queries

October 31, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 1 Comment

Gan et al., VLDB'18. Today we’re temporarily pausing our tour through some of the OSDI’18 papers in order to look at a great sketch-based data structure for quantile queries over high-cardinality aggregates. That’s a bit of a mouthful, so let’s jump straight into an example of ...

Noria: dynamic, partially-stateful data-flow for high-performance web applications

October 29, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 3 Comments

Gjengset, Schwarzkopf et al., OSDI'18. I have way more margin notes for this paper than I typically do, and that’s a reflection of my struggle to figure out what kind of thing we’re dealing with here. Noria doesn’t want to fit neatly into any existing box! We’ve ...

RobinHood: tail latency aware caching – dynamic reallocation from cache-rich to cache-poor

October 26, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 11 Comments

Berger et al., OSDI'18. It’s time to rethink everything you thought you knew about caching! My mental model goes something like this: we have a set of items whose popularity probably follows a power law. We have a certain finite cache capacity, and ...

Maelstrom: mitigating datacenter-level disasters by draining interdependent traffic safely and efficiently

October 24, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 4 Comments

Veeraraghavan et al., OSDI'18. Here’s a really valuable paper detailing four-plus years of experience dealing with datacenter outages at Facebook. Maelstrom is the system Facebook use in production to mitigate and recover from datacenter-level disasters. The high-level idea is simple: drain traffic ...

LegoOS: a disseminated, distributed OS for hardware resource disaggregation

October 22, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 6 Comments

Shan et al., OSDI'18. One of the interesting trends in hardware is the proliferation and importance of dedicated accelerators as general-purpose CPUs stopped benefitting from Moore’s law. At the same time we’ve seen networking getting faster and faster, causing us to rethink some of the ...

Orca: differential bug localization in large-scale services

October 19, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 8 Comments

Bhagwan et al., OSDI'18. Earlier this week we looked at REPT, the reverse debugging tool deployed live in the Windows Error Reporting service. Today it’s the turn of Orca, a bug localisation service that Microsoft have in production use for six of their large online services. The focus ...

the morning paper

a random walk through Computer Science research, by Adrian Colyer