Andromeda: performance, isolation, and velocity at scale in cloud network virtualization

Andromeda: performance, isolation, and velocity at scale in cloud network virtualization Dalton et al., NSDI'18 Yesterday we took a look at the Microsoft Azure networking stack; today it’s the turn of the Google Cloud Platform. (It’s a very handy coincidence to have two such experience and system design report papers appearing side by side so …

WSMeter: A performance evaluation methodology for Google’s production warehouse-scale computers

WSMeter: A performance evaluation methodology for Google’s production warehouse-scale computers Lee et al., ASPLOS'18 (The link above is to the ACM Digital Library; if you don’t have membership you should still be able to access the paper PDF by following the link from The Morning Paper blog post directly.) How do you know how well …

Google workloads for consumer devices: mitigating data movement bottlenecks

Google workloads for consumer devices: mitigating data movement bottlenecks Boroumand et al., ASPLOS'18 What if your mobile device could be twice as fast on common tasks, greatly improving the user experience, while at the same time significantly extending your battery life? This is the feat that the authors of today’s paper pull off, using a technique …

The QUIC transport protocol: design and Internet-scale deployment

The QUIC transport protocol: design and Internet-scale deployment Langley et al., SIGCOMM’17 QUIC is a transport protocol designed from the ground up by Google to improve the performance of HTTPS traffic. The chances are you’ve already used it: QUIC is deployed in Chrome, in the YouTube mobile app, and in the Google Search app on …

TFX: A TensorFlow-based production scale machine learning platform

TFX: A TensorFlow-based production scale machine learning platform Baylor et al., KDD'17 What world-class looks like in online product and service development has been undergoing quite the revolution over the last few years. The series of papers we've been looking at recently can help you to understand where the bar is (it will have moved …

Google Vizier: A service for black-box optimization

Google Vizier: a service for black-box optimization Golovin et al., KDD'17 We finished up last week by looking at the role of an internal (or external) experimentation platform. In today's paper Google remind us that such experimentation is just one form of optimisation. Google Vizier is an internal Google service for optimising pretty much anything. …

BBR: Congestion-based congestion control

BBR: Congestion-based congestion control Cardwell et al., ACM Queue Sep-Oct 2016 With thanks to Hossein Ghodse (@hossg) for recommending today's paper selection. This is the story of how members of Google's make-tcp-fast project developed and deployed a new congestion control algorithm for TCP called BBR (for Bottleneck Bandwidth and Round-trip propagation time), leading to 2-25x …