Extensible Distributed Coordination

Extensible Distributed Coordination - Distler et al. 2015. Coordination services such as ZooKeeper offer a deliberately limited API. As a consequence, more complex coordination tasks have to be implemented as multiple RPCs. In Extensible Distributed Coordination, Distler et al. describe a sandboxed extension mechanism for coordination services that allows execution of client logic in the …

Taming uncertainty in distributed systems with help from the network

Taming uncertainty in distributed systems with help from the network - Leners et al. 2015. Albatross is a membership service with a very interesting new twist: it exploits SDN functionality to actively enforce partitions! Perhaps it is not immediately obvious why that might be a good thing :). It turns out there are several benefits: …

Putting Consistency Back into Eventual Consistency

Putting Consistency Back into Eventual Consistency - Balegas et al. 2015. Today's choice is another pick from the recent crop of EuroSys 2015 papers. Balegas et al. show us that we don't have to put up with weak forms of eventual consistency, even in geo-replicated settings. In Building on Quicksand, Helland argued that we need …

Distributed Snapshots: Determining Global States of Distributed Systems

Distributed Snapshots: Determining Global States of Distributed Systems - Chandy & Lamport 1985. What state is your distributed system in? In the absence of a universal clock, is that even a well-formed question? And if you could take a distributed snapshot of system state, would that be useful? Through an algorithm that has simply become …

Cross-layer scheduling in cloud systems

Cross-layer scheduling in cloud systems - Alkaff et al. 2015. This paper was presented last month at the 2015 International Conference on Cloud Engineering, and explores what happens when you coordinate application scheduling with network route allocation via SDN (hence: cross-layer scheduling). With clusters of 30 nodes, the authors demonstrate results that can improve the …

SAMC: Semantic-aware model checking for fast discovery of deep bugs in cloud systems

SAMC: Semantic-aware model checking for fast discovery of deep bugs in cloud systems - Leesatapornwongsa et al. 2014. This is the second of three papers we'll be looking at this week on the theme of verifying correctness of, and catching bugs in, distributed systems. Yesterday we saw the Statecall Policy Language and associated tool chain …