LEMNA: explaining deep learning based security applications

November 30, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 8 Comments

Guo et al., CCS'18. Understanding why a deep learning model produces the outputs it does is an important part of gaining trust in the model, and in some situations being able to explain decisions is a strong requirement. Today’s paper shows that by carefully considering the architectural features ...

Towards usable checksums: automating the integrity verification of web downloads for the masses

November 28, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 7 Comments

Cherubini et al., CCS'18. If you tackled Monday’s paper on BEAT you deserve something a little easier to digest today, and ‘Towards usable checksums’ fits the bill nicely! There’s some great data-driven product management going on here as the authors set out ...

BEAT: asynchronous BFT made practical

November 26, 2018 (updated May 25, 2020) ~ Adrian Colyer

Duan et al., CCS'18. Reaching agreement (consensus) is hard enough; doing it in the presence of active adversaries who can tamper with or destroy your communications is much harder still. That’s the world of Byzantine fault tolerance (BFT). We’ve looked at Practical BFT (PBFT) and HoneyBadger on previous editions of ...

Uncertainty propagation in data processing systems

November 23, 2018 (updated May 25, 2020) ~ Adrian Colyer

Manousakis et al., SoCC'18. When I’m writing an edition of The Morning Paper, I often imagine a conversation with a hypothetical reader sat in a coffee shop somewhere at the start of their day. There are three levels of takeaway from today’s paper choice: If you’re downing a quick ...

Continuum: a platform for cost-aware low-latency continual learning

November 21, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 3 Comments

Tian et al., SoCC'18. Let’s start with some broad approximations. Batching leads to higher throughput at the cost of higher latency. Processing items one at a time leads to lower latency and often reduced throughput. We can recover throughput to a degree by throwing horizontally scalable resources ...

ScootR: scaling R dataframes on dataflow systems

November 19, 2018 (updated May 25, 2020) ~ Adrian Colyer

Kunft et al., SoCC'18. The language of big data is Java (/ Scala). The languages of data science are Python and R. So what do you do when you want to run your data science analysis over large amounts of data? ...programming languages with rich support for ...

Overload control for scaling WeChat microservices

November 16, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 9 Comments

Zhou et al., SoCC'18. There are two reasons to love this paper. First off, we get some insights into the backend that powers WeChat; and secondly, the authors share the design of the battle-hardened overload control system DAGOR, which has been in production at WeChat for five years. ...

Unikernels as processes

November 14, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 1 Comment

Williams et al., SoCC'18. Ah, unikernels. Small size, fast booting, tiny attack surface, resource efficient, hard to deploy on existing cloud platforms, and undebuggable in production. There’s no shortage of strong claims on both sides of the fence. See for example: Unikernels: library operating systems for the cloud; Jitsu: just-in-time summoning of ...

Debugging distributed systems with why-across-time provenance

November 12, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 3 Comments

Whittaker et al., SoCC'18. This value is 17 here, and it shouldn’t be. Why did the get request return 17? Sometimes the simplest questions can be the hardest to answer. As the opening sentence of this paper states: “Debugging distributed systems is hard.” The kind of why questions we’re ...

ApproxJoin: approximate distributed joins

November 9, 2018 (updated May 25, 2020) ~ Adrian Colyer ~ 1 Comment

Le Quoc et al., SoCC'18 (GitHub: https://ApproxJoin.github.io). The join is a fundamental data processing operation and has been heavily optimised in relational databases. When you’re working with large volumes of unstructured data though, say with a data processing framework such as Flink or Spark, joins become distributed and much more expensive. ...

the morning paper

a random walk through Computer Science research, by Adrian Colyer

Month: November 2018