Towards usable checksums: automating the integrity verification of web downloads for the masses

Cherubini et al., CCS'18. If you tackled Monday’s paper on BEAT you deserve something a little easier to digest today, and ‘Towards usable checksums’ fits the bill nicely! There’s some great data-driven product management going on here as the authors set out ...

Continuum: a platform for cost-aware low-latency continual learning

Tian et al., SoCC'18. Let’s start with some broad approximations. Batching leads to higher throughput at the cost of higher latency. Processing items one at a time leads to lower latency and often reduced throughput. We can recover throughput to a degree by throwing horizontally scalable resources ...

Unikernels as processes

Williams et al., SoCC'18. Ah, unikernels. Small size, fast booting, tiny attack surface, resource efficient, hard to deploy on existing cloud platforms, and undebuggable in production. There’s no shortage of strong claims on both sides of the fence. See for example: Unikernels: library operating systems for the cloud; Jitsu: just-in-time summoning of ...

ApproxJoin: approximate distributed joins

Le Quoc et al., SoCC'18 (GitHub: https://ApproxJoin.github.io). The join is a fundamental data processing operation and has been heavily optimised in relational databases. When you’re working with large volumes of unstructured data though, say with a data processing framework such as Flink or Spark, joins become distributed and much more expensive. ...