Performance analysis of cloud applications Ardelean et al., NSDI'18 Today’s choice gives us an insight into how Google measure and analyse the performance of large user-facing services such as Gmail (from which most of the data in the paper is taken). It’s a paper in two halves. The first part of the paper demonstrates through …
Author: adriancolyer
Stateless datacenter load-balancing with Beamer
Stateless datacenter load-balancing with Beamer Olteanu et al., NSDI'18 We’ve spent the last couple of days looking at datacenter network infrastructure, but we didn’t touch on the topic of load balancing. For a single TCP connection, you want all of the packets to end up at the same destination. Logically, a load balancer (a.k.a. ‘mux’) …
Andromeda: performance, isolation, and velocity at scale in cloud network virtualization
Andromeda: performance, isolation, and velocity at scale in cloud network virtualization Dalton et al., NSDI'18 Yesterday we took a look at the Microsoft Azure networking stack; today it’s the turn of the Google Cloud Platform. (It’s a very handy coincidence to have two such experience and system design report papers appearing side by side so …
Azure accelerated networking: SmartNICs in the public cloud
Azure accelerated networking: SmartNICs in the public cloud Firestone et al., NSDI'18 We’re still on the ‘beyond CPUs’ theme today, with a great paper from Microsoft detailing their use of FPGAs to accelerate networking in Azure. Microsoft have been doing this since 2015, and hence this paper also serves as a wonderful experience report documenting …
NetChain: Scale-free sub-RTT coordination
NetChain: Scale-free sub-RTT coordination Jin et al., NSDI'18 NetChain won a best paper award at NSDI 2018 earlier this month. By thinking outside of the box (in this case, the box is the chassis containing the server), Jin et al. have demonstrated how to build a coordination service (think Apache ZooKeeper) with incredibly low latency …
SmoothOperator: reducing power fragmentation and improving power utilization in large-scale datacenters
SmoothOperator: reducing power fragmentation and improving power utilization in large-scale datacenters Hsu et al., ASPLOS'18 What do you do when your theory of constraints analysis reveals that power has become your major limiting factor? That is, you can’t add more servers to your existing datacenter(s) without blowing your power budget, and you don’t want to …
Skyway: connecting managed heaps in distributed big data systems
Skyway: connecting managed heaps in distributed big data systems Nguyen et al., ASPLOS'18 Yesterday we saw how to make Java objects persistent using NVM-backed heaps with Espresso. One of the drawbacks of using that as a persistence mechanism is that they’re only stored in the memory of a single node. If only there was some …
Espresso: brewing Java for more non-volatility with non-volatile memory
Espresso: brewing Java for more non-volatility with non-volatile memory Wu et al., ASPLOS'18 What happens when you introduce non-volatile memory (NVM) to the world of Java? In theory, with a heap backed by NVM, we should get persistence for free? It’s not quite that straightforward of course, but Espresso gets you pretty close. There are …
Watching for software inefficiencies with Witch
Watching for software inefficiencies with Witch Wen et al., ASPLOS'18 (The link above is to the ACM Digital Library; if you don’t have membership you should still be able to access the paper PDF by following the link from The Morning Paper blog post directly.) Inefficiencies abound in complex, layered software. These inefficiencies can arise …
WSMeter: A performance evaluation methodology for Google’s production warehouse-scale computers
WSMeter: A performance evaluation methodology for Google’s production warehouse-scale computers Lee et al., ASPLOS'18 (The link above is to the ACM Digital Library; if you don’t have membership you should still be able to access the paper PDF by following the link from The Morning Paper blog post directly.) How do you know how well …