A Hitchhiker’s Guide to Fast and Efficient Data Reconstruction in Erasure-coded Data Centers

A Hitchhiker's guide to fast and efficient data reconstruction in erasure-coded data centers - Rashmi et al. So far this week we've looked at a programming languages paper and a systems paper, so for today I thought it would be fun to look at an algorithm-based paper. HDFS enables horizontally scalable low-cost storage for the …

f4: Facebook’s warm BLOB storage system

f4: Facebook's warm BLOB storage system - Muralidhar et al. 2014 This is a story of system engineering trade-offs, a design informed by data analysis, and hard-won experience. It's the story of how Facebook implemented a tiered storage solution for BLOBs and introduced per data class (temperature) replication factor, latency, and time-to-recovery tuning. If you're …

Predicate Logic as Programming Language

Predicate Logic as Programming Language - Kowalski 1974 The purpose of programming languages is to enable the communication from man to machine of problems and their general means of solution. Kowalski shows us that predicate logic can be used as the basis of a "useful and practical, high-level, non-deterministic programming language with sound theoretical foundations." …

Scala Actors: Unifying thread-based and event-based programming

Scala Actors: Unifying thread-based and event-based programming - Haller & Odersky 2008 Yesterday we saw a Haskell-based approach to unifying events and threads, today's paper shows how to apply some of those same ideas on top of the JVM using Scala. There is an impedance mismatch between message-passing concurrency and virtual machines, such as the …

A language-based approach to unifying events and threads

A Language-based Approach to Unifying Events and Threads - Li and Zdancewic, 2006 So far in this mini-series we've seen that thread and event-based models are duals, that threads are a bad idea - you should really be using events, and that events are a bad idea - you should really be using threads. What …

On the duality of operating system structures

On the Duality of Operating System Structures - Lauer and Needham, 1978 The pendulum currently says "threads and locks are bad, events are good." Vigorous defences are mounted in favour of one system over the other, and manifestos are written. Nowadays this debate rages over the best way to build applications and frameworks, but it …

Tachyon: Reliable, Memory Speed Storage for Cluster Computing

Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks - Li et al. 2014 Data processing can often be naturally expressed as a sequence of steps in a pipeline. For example, the Unix command line below pipes a file through a series of transforms to ultimately generate some output. cat Fin.csv | a | …

Photon: Fault-tolerant and scalable joining of continuous data streams

Photon: Fault-tolerant and scalable joining of continuous data streams - Google 2013 To the best of our knowledge, this is the first paper to formulate and solve the problem of joining multiple streams continuously under these system constraints: exactly-once semantics, fault-tolerance at datacenter-level, high scalability, low latency, unordered streams, and delayed primary stream. It's interesting …