Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions

Ganesan et al., FAST 2017

It's a tough life being the developer of a distributed datastore. Thanks to the wonderful work of Kyle Kingsbury (aka @aphyr) and his efforts on Jepsen.io, awareness of data loss and related issues in …

HopFS: Scaling hierarchical file system metadata using NewSQL databases

Niazi et al., FAST 2017

If you're working with big data and Hadoop, this one paper could repay your investment in The Morning Paper many times over (ok, The Morning Paper is free, but you do pay with your time to read it). You know …

Write-limited sorts and joins for persistent memory

Viglas, VLDB 2014

This is the second of the two research-for-practice papers for this week. Once more the topic is how database storage algorithms can be optimised for NVM, this time examining the asymmetry between reads and writes on NVM. This is premised on Viglas’ assertion that: Writes …

Let’s talk about storage and recovery methods for non-volatile memory database systems

Arulraj et al., SIGMOD 2015

Update: fixed a bunch of broken links.

I can't believe I only just found out about this paper! It's exactly what I've been looking for in terms of an analysis of the impacts of NVM on data storage …

Ambry: LinkedIn’s scalable geo-distributed object store

Noghabi et al., SIGMOD ’16

Ambry is LinkedIn’s blob store, designed to handle the demands of a modern social network: Hundreds of millions of users continually upload and view billions of diverse massive media objects, from photos and videos to documents. These large media objects, called blobs, are uploaded …