STTR: A system for tracking all vehicles all the time at the edge of the network

August 30, 2018 ~ August 29, 2018 ~ adriancolyer ~ 6 Comments

STTR: A system for tracking all vehicles all the time at the edge of the network, Xu et al., DEBS'18. With apologies for only bringing you two paper write-ups this week: we moved house, which turns out to be not at all conducive to quiet study of research papers! Today’s smart camera surveillance systems are …

ServiceFabric: a distributed platform for building microservices in the cloud

June 5, 2018 ~ May 29, 2018 ~ adriancolyer ~ 19 Comments

ServiceFabric: a distributed platform for building microservices in the cloud, Kakivaya et al., EuroSys'18. (If you don’t have ACM Digital Library access, the paper can be accessed by following the link above directly from The Morning Paper blog site.) Microsoft’s Service Fabric powers many of Azure’s critical services. It’s been in development for around …

SmoothOperator: reducing power fragmentation and improving power utilization in large-scale datacenters

April 27, 2018 ~ April 22, 2018 ~ adriancolyer

SmoothOperator: reducing power fragmentation and improving power utilization in large-scale datacenters, Hsu et al., ASPLOS'18. What do you do when your theory of constraints analysis reveals that power has become your major limiting factor? That is, you can’t add more servers to your existing datacenter(s) without blowing your power budget, and you don’t want to …

Skyway: connecting managed heaps in distributed big data systems

April 26, 2018 ~ May 10, 2018 ~ adriancolyer ~ 3 Comments

Skyway: connecting managed heaps in distributed big data systems, Nguyen et al., ASPLOS'18. Yesterday we saw how to make Java objects persistent using NVM-backed heaps with Espresso. One of the drawbacks of using that as a persistence mechanism is that they’re only stored in the memory of a single node. If only there was some …

WSMeter: A performance evaluation methodology for Google’s production warehouse-scale computers

April 23, 2018 ~ April 22, 2018 ~ adriancolyer

WSMeter: A performance evaluation methodology for Google’s production warehouse-scale computers, Lee et al., ASPLOS'18. (The link above is to the ACM Digital Library; if you don’t have membership you should still be able to access the paper PDF by following the link from The Morning Paper blog post directly.) How do you know how well …

Anna: A KVS for any scale

March 27, 2018 ~ March 24, 2018 ~ adriancolyer ~ 13 Comments

Anna: A KVS for any scale, Wu et al., ICDE'18. This work comes out of the RISE project at Berkeley, and regular readers of The Morning Paper will be familiar with much of the background. Here’s how Joe Hellerstein puts it in his blog post introducing the work: As researchers, we asked the counter-cultural question: …

Protocol aware recovery for consensus-based storage

February 27, 2018 ~ February 24, 2018 ~ adriancolyer ~ 3 Comments

Protocol aware recovery for consensus-based storage, Alagappan et al., FAST’18. Following on from their excellent previous work on ‘All file systems are not created equal’ (well worth a read if you haven’t encountered it yet), in this paper the authors look at how well some of our most reliable protocols, those used in …

Fail-slow at scale: evidence of hardware performance faults in large production systems

February 26, 2018 ~ February 24, 2018 ~ adriancolyer ~ 6 Comments

Fail-slow at scale: evidence of hardware performance faults in large production systems, Gunawi et al., FAST’18. The first thing that strikes you about this paper is the long list of authors from multiple different establishments. That’s because it’s actually a study of 101 different fail-slow hardware incidents collected across large-scale cluster deployments in 12 different …

SoK: Consensus in the age of blockchains

February 12, 2018 ~ February 9, 2018 ~ adriancolyer ~ 6 Comments

SoK: Consensus in the age of blockchains, Bano et al., arXiv 2017. There are so many things to consider when evaluating a blockchain-based technology / system. For example (and in no particular order): What cryptographic building blocks does it depend on? Does it offer privacy of identity (anonymity)? Does it offer privacy of data? …

Why is random testing effective for partition tolerance bugs?

January 23, 2018 ~ January 19, 2018 ~ adriancolyer ~ 1 Comment

Why is random testing effective for partition tolerance bugs? Majumdar & Niksic, POPL'18. A little randomness is a powerful thing! It can make the impossible possible (FLP), balance systems remarkably well (the power of two random choices), and of course underpin much of cryptography. Today’s paper choice examines the unreasonable effectiveness of random …