LEMNA: explaining deep learning based security applications Guo et al., CCS'18 Understanding why a deep learning model produces the outputs it does is an important part of gaining trust in the model, and in some situations being able to explain decisions is a strong requirement. Today’s paper shows that by carefully considering the architectural features …
Month: November 2018
Towards usable checksums: automating the integrity verification of web downloads for the masses
Cherubini et al., CCS'18 If you tackled Monday’s paper on BEAT you deserve something a little easier to digest today, and ‘Towards usable checksums’ fits the bill nicely! There’s some great data-driven product management going on here as the authors set out …
BEAT: asynchronous BFT made practical
Duan et al., CCS'18 Reaching agreement (consensus) is hard enough; doing it in the presence of active adversaries who can tamper with or destroy your communications is much harder still. That’s the world of Byzantine fault tolerance (BFT). We’ve looked at Practical BFT (PBFT) and HoneyBadger in previous editions of …
Uncertainty propagation in data processing systems
Manousakis et al., SoCC'18 When I’m writing an edition of The Morning Paper, I often imagine a conversation with a hypothetical reader sat in a coffee shop somewhere at the start of their day. There are three levels of takeaway from today’s paper choice: If you’re downing a quick …
Continuum: a platform for cost-aware low-latency continual learning
Tian et al., SoCC'18 Let’s start with some broad approximations. Batching leads to higher throughput at the cost of higher latency. Processing items one at a time leads to lower latency and often reduced throughput. We can recover throughput to a degree by throwing horizontally scalable resources …
ScootR: scaling R dataframes on dataflow systems
Kunft et al., SoCC'18 The language of big data is Java ( / Scala). The languages of data science are Python and R. So what do you do when you want to run your data science analysis over large amounts of data? ...programming languages with rich support for …
Overload control for scaling WeChat microservices
Zhou et al., SoCC'18 There are two reasons to love this paper. First, we get some insights into the backend that powers WeChat; second, the authors share the design of the battle-hardened overload control system DAGOR that has been in production at WeChat for five years. …
Unikernels as processes
Williams et al., SoCC'18 Ah, unikernels. Small size, fast booting, tiny attack surface, resource efficient, hard to deploy on existing cloud platforms, and undebuggable in production. There’s no shortage of strong claims on both sides of the fence. See for example: Unikernels: library operating systems for the cloud Jitsu: just-in-time summoning of …
Debugging distributed systems with why-across-time provenance
Whittaker et al., SoCC'18 This value is 17 here, and it shouldn’t be. Why did the get request return 17? Sometimes the simplest questions can be the hardest to answer. As the opening sentence of this paper states: Debugging distributed systems is hard. The kind of why questions we’re …
ApproxJoin: approximate distributed joins
Le Quoc et al., SoCC'18 GitHub: https://ApproxJoin.github.io The join is a fundamental data processing operation and has been heavily optimised in relational databases. When you’re working with large volumes of unstructured data though, say with a data processing framework such as Flink or Spark, joins become distributed and much more expensive. …