Reducing Crash Recoverability to Reachability

February 4, 2016 ~ Adrian Colyer ~ 1 Comment

Reducing Crash Recoverability to Reachability - Koskinen & Yang 2016. Techniques such as shadow paging and write-ahead logging can help with recovery from crashes, but even then it takes a lot of sophistication to get it right and deal with cases such as crashing during recovery itself. In today's paper Koskinen and Yang first provide ...

The Design and Implementation of the Wave Transactional Filesystem

January 25, 2016 ~ Adrian Colyer ~ 5 Comments

The Design and Implementation of the Wave Transactional Filesystem - Escriva & Sirer 2015. Since we've been looking at various combinations of storage and transactions, it seemed appropriate to start this week with the Wave Transactional Filesystem. Throughout the paper you'll find this abbreviated as WTF, but my brain can't read that without supplying the ...

Split-Level IO Scheduling

October 28, 2015 ~ Adrian Colyer ~ 1 Comment

Split-Level IO Scheduling - Yang et al. 2015. The central idea in today's paper is pretty simple: block-level I/O schedulers (the most common kind) lack the higher-level information necessary to perform write-reordering and accurate accounting, whereas system-call level schedulers have the appropriate context but lack the low-level knowledge needed to build efficient schedulers - ...

IPFS – Content Addressed, Versioned, P2P File System

October 5, 2015 ~ Adrian Colyer ~ 3 Comments

IPFS - Content Addressed, Versioned, P2P File System - Benet 2014. This paper has sat on my reading list for almost a year! I first heard about it in Joe Armstrong's 2014 talk at CodeMesh, "Connecting things together is really difficult but it could and should be rather easy". CodeMesh 2015 is just around the ...

RIPQ: Advanced photo caching on flash for Facebook

February 27, 2015 ~ Adrian Colyer

RIPQ: Advanced Photo Caching on Flash for Facebook - Tang et al. 2015. It's three for the price of one with this paper: we get to deepen our understanding of the characteristics of flash, examine a number of priority queue and caching algorithms, and get a glimpse into what's behind an important part of Facebook's ...

F2FS: A new file system for flash storage

February 26, 2015 ~ Adrian Colyer ~ 2 Comments

F2FS: A New File System for Flash Storage - Lee et al. 2015. For the second half of February's research conference highlights, we're visiting FAST '15, the File and Storage Technologies conference. We've seen a few statements so far in this series that Flash storage has the potential to be very disruptive. But beyond the ...

A Hitchhiker’s Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers

December 17, 2014 ~ Adrian Colyer ~ 2 Comments

A Hitchhiker's guide to fast and efficient data reconstruction in erasure-coded data centers - Rashmi et al. So far this week we've looked at a programming languages paper and a systems paper, so for today I thought it would be fun to look at an algorithm-based paper. HDFS enables horizontally scalable low-cost storage for the ...

f4: Facebook’s warm BLOB storage system

December 16, 2014 ~ Adrian Colyer ~ 5 Comments

f4: Facebook's warm BLOB storage system - Muralidhar et al. 2014. This is a story of system engineering trade-offs, a design informed by data analysis, and hard-won experience. It's the story of how Facebook implemented a tiered storage solution for BLOBs and introduced per data class (temperature) replication factor, latency, and time-to-recovery tuning. If you're ...

Tachyon: Reliable, Memory Speed Storage for Cluster Computing

December 4, 2014 ~ Adrian Colyer ~ 3 Comments

Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks - Li et al. 2014. Data processing can often be naturally expressed as a sequence of steps in a pipeline. For example, the Unix command line below pipes a file through a series of transforms to ultimately generate some output: cat Fin.csv | a | ...

An Evaluation of Amazon S3’s Consistency Behavior

November 12, 2014 ~ Adrian Colyer ~ 2 Comments

Eventual Consistency: How soon is eventual? An Evaluation of Amazon S3's Consistency Behavior - Bermbach and Tai, 2011. In honour of AWS re:Invent this week, and since we've already covered the excellent Dynamo paper at #31 in this series, here's a paper looking at eventual consistency and the behaviour of S3. In this work we ...

the morning paper

a random walk through Computer Science research, by Adrian Colyer

Storage