Adaptive logging: optimizing logging and recovery costs in distributed in-memory databases

Adaptive Logging: Optimizing logging and recovery costs in distributed In-memory databases Yao et al., SIGMOD 2016 This is a paper about the trade-offs between transaction throughput and database recovery time. Intuitively, for example, you can do a little more work on each transaction (lowering throughput) in order to reduce the time it takes to recover … Continue reading Adaptive logging: optimizing logging and recovery costs in distributed in-memory databases

Apache Hadoop YARN: Yet another resource negotiator

Apache Hadoop YARN: Yet Another Resource Negotiator Vavilapalli et al., SoCC 2013 The opening section of Prof. Demirbas' reading list is concerned with programming the datacenter, aka 'the Datacenter Operating System' - though I can't help but think of Mesosphere when I hear that latter phrase. There are four papers: in publication order these are … Continue reading Apache Hadoop YARN: Yet another resource negotiator

“A Distributed Systems Seminar Reading List,” Spring 2017 edition

Update: links giving 404s were too confusing, so I've removed links to not-yet-published posts and will add them back in at the end of the week! Last year we looked at Murat Demirbas' Distributed systems seminar reading list for Spring 2016. Now of course it's 2017 and Prof. Demirbas has a new list of papers … Continue reading “A Distributed Systems Seminar Reading List,” Spring 2017 edition

Diamond: Automating data management and storage for wide-area, reactive applications

Diamond: Automating data management and storage for wide-area, reactive applications Zhang et al., OSDI 2016 Diamond tackles the end-to-end problem of building reactive applications, defined here as those that update end-user visible state without requiring any explicit user action: … today’s popular applications are reactive: they provide users with the illusion of continuous synchronization across … Continue reading Diamond: Automating data management and storage for wide-area, reactive applications

XFT: Practical fault-tolerance beyond crashes

XFT: Practical fault-tolerance beyond crashes Liu et al., OSDI 2016 Here’s something that’s been bugging me for a while now. The state of the art in security has moved from the assumption of a secured perimeter and a trusted environment inside the firewall to a notion of perimeter-less security. It’s pretty much impossible to keep … Continue reading XFT: Practical fault-tolerance beyond crashes

DQBarge: Improving data-quality tradeoffs in large-scale internet services

DQBarge: Improving data-quality tradeoffs in large-scale Internet services Chow et al., OSDI 2016 I'm sure many of you recall the 2009 classic "The Datacenter as a Computer," which encouraged us to think of the datacenter as a warehouse-scale computer. From being glad simply to have such a computer, the bar keeps on moving. We don't … Continue reading DQBarge: Improving data-quality tradeoffs in large-scale internet services

Just say NO to Paxos overhead: replacing consensus with network ordering

Just say NO to Paxos overhead: replacing consensus with network ordering Li et al., OSDI 2016 Everyone knows that consensus systems such as Paxos, Viewstamped Replication, and Raft impose high overhead and have limited throughput and scalability. Li et al. carefully examine the assumptions on which those systems are based, and find out that within … Continue reading Just say NO to Paxos overhead: replacing consensus with network ordering

The SNOW theorem and latency-optimal read-only transactions

The SNOW theorem and latency-optimal read-only transactions Lu et al., OSDI 2016 Consider a read-only workload (as in 100%). You can make that really fast - never any need to coordinate, never any need to invalidate any cached values… Now consider a write-only workload - you can make that even faster, if no-one’s ever going … Continue reading The SNOW theorem and latency-optimal read-only transactions

Slicer: Auto-sharding for datacenter applications

Slicer: Auto-sharding for datacenter applications Adya et al. (Google), OSDI 2016 Another piece of Google's back-end infrastructure is revealed in this paper, ready to spawn some new open source implementations of the same ideas no doubt. Slicer is a general purpose sharding service. I normally think of sharding as something that happens within a (typically … Continue reading Slicer: Auto-sharding for datacenter applications