File systems unfit as distributed storage backends: lessons from ten years of Ceph evolution

November 6, 2019 ~ November 7, 2019 ~ adriancolyer ~ 19 Comments

File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution, Aghayev et al., SOSP'19. Ten years of hard-won lessons packed into just 17 pages (13 if you don’t count the references!) make this paper extremely good value for your time. It’s also a fabulous example of recognising and challenging implicit assumptions. …

Towards web-based delta synchronization for cloud storage systems

March 2, 2018 ~ February 24, 2018 ~ adriancolyer ~ 2 Comments

Towards web-based delta synchronization for cloud storage systems, Xiao et al., FAST’18. If you use Dropbox (or an equivalent service) to synchronise files between your Mac or PC and the cloud, then it uses an efficient delta-sync (rsync) protocol to upload only the parts of a file that have changed. If you use a web …

Clay codes: moulding MDS codes to yield an MSR code

March 1, 2018 ~ February 24, 2018 ~ adriancolyer ~ 1 Comment

Clay codes: moulding MDS codes to yield an MSR code, Vajha et al., FAST’18. As we know, storage fails (or the nodes to which it is directly attached fail, which amounts to pretty much the same thing). Assuming we can detect the failure, we need to recover from it. In order to be able to recover, …

Barrier-enabled IO stack for Flash storage

February 28, 2018 ~ February 24, 2018 ~ adriancolyer ~ 1 Comment

Barrier-enabled IO stack for flash storage, Won et al., FAST’18. The performance of Flash storage has benefited greatly from concurrency and parallelism - for example, multi-channel controllers, large caches, and deep command queues. At the same time, the time to program an individual Flash cell has stayed fairly static (and even become slightly worse in …

Protocol aware recovery for consensus-based storage

February 27, 2018 ~ February 24, 2018 ~ adriancolyer ~ 3 Comments

Protocol aware recovery for consensus based storage, Alagappan et al., FAST’18. Following on from their excellent previous work on ‘All file systems are not created equal’ (well worth a read if you haven’t encountered it yet), in this paper the authors look at how well some of our most reliable protocols, those used in …

Azure Data Lake Store: a hyperscale distributed file service for big data analytics

July 4, 2017 ~ July 2, 2017 ~ adriancolyer ~ 3 Comments

Azure data lake store: a hyperscale distributed file service for big data analytics, Douceur et al., SIGMOD'17. Today's paper takes us inside Microsoft Azure's distributed file service called the Azure Data Lake Store (ADLS). ADLS is the successor to an internal file system called Cosmos, and marries Cosmos semantics with HDFS, supporting both Cosmos and …

The design, implementation and deployment of a system to transparently compress hundreds of petabytes of image files for a file storage service

May 1, 2017 ~ April 30, 2017 ~ adriancolyer ~ 3 Comments

The design, implementation, and deployment of a system to transparently compress hundreds of petabytes of image files for a file storage service, Horn et al., NSDI'17. When I first started reading, I thought this paper was going to be about a new compression format Dropbox had introduced for JPEG images. And it is about that, …

Application crash consistency and performance with CCFS

March 15, 2017 ~ July 31, 2017 ~ adriancolyer ~ 3 Comments

Application crash consistency and performance with CCFS, Pillai et al., FAST 2017. I know I tend to get over-excited about some of the research I cover, but this is truly a fabulous piece of work. We looked at "All file systems are not created equal" in a previous edition of The Morning Paper, which showed that …

Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions

March 8, 2017 ~ July 31, 2017 ~ adriancolyer ~ 10 Comments

Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions, Ganesan et al., FAST 2017. It's a tough life being the developer of a distributed datastore. Thanks to the wonderful work of Kyle Kingsbury (aka @aphyr) and his efforts on Jepsen.io, awareness of data loss and related issues in …

HopFS: Scaling hierarchical file system metadata using NewSQL databases

March 6, 2017 ~ July 31, 2017 ~ adriancolyer ~ 10 Comments

HopFS: Scaling hierarchical file system metadata using NewSQL databases, Niazi et al., FAST 2017. If you're working with big data and Hadoop, this one paper could repay your investment in The Morning Paper many times over (ok, The Morning Paper is free - but you do pay with your time to read it). You know …