Uncovering bugs in Distributed Storage Systems during Testing (not in production!)

May 5, 2016 / July 27, 2017 ~ adriancolyer ~ 4 Comments

Uncovering bugs in Distributed Storage Systems during Testing (not in production!) - Deligiannis et al. 2016 We interviewed technical leaders and senior managers in Microsoft Azure regarding the top problems in distributed system development. The consensus was that one of the most critical problems today is how to improve testing coverage so that bugs can …

BTrDB: Optimizing Storage System Design for Timeseries Processing

May 4, 2016 / July 27, 2017 ~ adriancolyer ~ 4 Comments

BTrDB: Optimizing Storage System Design for Timeseries Processing - Anderson & Culler 2016 It turns out you can accomplish quite a lot with 4,709 lines of Go code! How about a full time-series database implementation, robust enough to be run in production for a year where it stored 2.1 trillion data points, and supporting 119M …

Gorilla: A fast, scalable, in-memory time series database

May 3, 2016 / July 27, 2017 ~ adriancolyer ~ 16 Comments

Gorilla: A fast, scalable, in-memory time series database - Pelkonen et al. 2015 Error rates across one of Facebook's sites were spiking. The problem had first shown up through an automated alert triggered by an in-memory time-series database called Gorilla a few minutes after the problem started. One set of engineers mitigated the immediate issue. …

Slacker: Fast Distribution with Lazy Docker Containers

May 2, 2016 / July 27, 2017 ~ adriancolyer ~ 2 Comments

Slacker: Fast Distribution with Lazy Docker Containers - Harter et al. 2016 On your marks, get set, docker run -it ubuntu bash. How long did it take before you saw the bash prompt? In this wonderful FAST'16 paper, Harter et al. analyse what happens behind the scenes when you docker run a container image, and …

Optimizing Distributed Actor Systems for Dynamic Interactive Services

April 29, 2016 / July 27, 2017 ~ adriancolyer ~ 1 Comment

Optimizing Distributed Actor Systems for Dynamic Interactive Services - Newell et al. 2016 I'm sure many of you have heard of the Orleans distributed actor system, which was used to build some of the systems supporting Microsoft's online Halo game. Halo Presence is an interactive application which implements presence services for a multi-player game running …

Data Tiering in Heterogeneous Memory Systems

April 28, 2016 / July 27, 2017 ~ adriancolyer ~ 2 Comments

Data Tiering in Heterogeneous Memory Systems - Dulloor et al. 2016 Another fantastic EuroSys 2016 paper for today, and one with results that are of great importance in understanding the cost and performance implications of the new generation of non-volatile memory (NVM) heading to our data centers soon. Furthermore, we also get some great insight …

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server

April 27, 2016 / July 27, 2017 ~ adriancolyer ~ 2 Comments

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server - Cui et al. 2016 (EuroSys 2016) We know that deep learning is well suited to GPUs since it has inherent parallelism. But so far this has mostly been limited to either a single GPU (e.g. using Caffe) or to specially built distributed …

The Linux Scheduler: a Decade of Wasted Cores

April 26, 2016 / July 27, 2017 ~ adriancolyer ~ 30 Comments

The Linux Scheduler: a Decade of Wasted Cores - Lozi et al. 2016 This is the first in a series of papers from EuroSys 2016. There are three strands here: first of all, there's some great background into how scheduling works in the Linux kernel; secondly, there's a story about Software Aging and how changing …

Delta State Replicated Data Types

April 25, 2016 / July 27, 2017 ~ adriancolyer

Delta State Replicated Data Types - Almeida et al. 2016 You know when you want to use CRDTs for their convergence properties, but the amount of state you're required to pass around gets out of hand? In this paper, Almeida et al. show how to retain the advantages of state-based CRDTs, but with much smaller …

GloVe: Global Vectors for Word Representation

April 22, 2016 / July 27, 2017 ~ adriancolyer ~ 17 Comments

GloVe: Global Vectors for Word Representation - Pennington et al. 2014 Yesterday we looked at some of the amazing properties of word vectors with word2vec. Pennington et al. argue that the online scanning approach used by word2vec is suboptimal since it doesn't fully exploit statistical information regarding word co-occurrences. They demonstrate a Global Vectors (GloVe) …