Taiji: managing global user traffic for large-scale Internet services at the edge

Taiji: managing global user traffic for large-scale internet services at the edge Xu et al., SOSP'19 It’s another networking paper to close out the week (and our coverage of SOSP’19), but whereas Snap looked at traffic routing within the datacenter, Taiji is concerned with routing traffic from the edge to a datacenter. It’s been in ... Continue Reading

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs Daglis et al., ASPLOS'19 Last week we learned about the increased tail-latency sensitivity of microservices-based applications with high RPC fan-outs. Seer uses estimates of queue depths to mitigate latency spikes on the order of 10-100ms, in conjunction with a cluster manager. Today’s paper choice, RPCValet, operates at ... Continue Reading

Slim: OS kernel support for a low-overhead container overlay network

Slim: OS kernel support for a low-overhead container overlay network Zhuo et al., NSDI'19 Container overlay networks rely on packet transformations, with each packet traversing the networking stack twice on its way from the sending container to the receiving container. There are CPU, throughput, and latency overheads associated with those traversals. In this paper, we ... Continue Reading

Understanding lifecycle management complexity of datacenter topologies

Understanding lifecycle management complexity of datacenter topologies Zhang et al., NSDI'19 There has been plenty of interesting research on network topologies for datacenters, with Clos-like tree topologies and expander-based graph topologies both shown to scale using widely deployed hardware. This research tends to focus on performance properties such as throughput and latency, together with ... Continue Reading

The case for network-accelerated query processing

The case for network-accelerated query processing Lerner et al., CIDR'19 Datastores continue to advance on a number of fronts. Some of those that come to mind are adapting to faster networks (e.g. ‘FARM: Fast Remote Memory’) and persistent memory (see e.g. ‘Let’s talk about storage and recovery methods for non-volatile memory database systems’), deeply integrating ... Continue Reading

BDS: A centralized near-optimal overlay network for inter-datacenter data replication

BDS: A centralized near-optimal overlay network for inter-datacenter data replication Zhang et al., EuroSys'18 (If you don’t have ACM Digital Library access, the paper can be accessed by following the link above directly from The Morning Paper blog site). This is the story of how inter-datacenter multicast transfers at Baidu were sped up by a ... Continue Reading

Andromeda: performance, isolation, and velocity at scale in cloud network virtualization

Andromeda: performance, isolation, and velocity at scale in cloud network virtualization Dalton et al., NSDI'18 Yesterday we took a look at the Microsoft Azure networking stack; today it’s the turn of the Google Cloud Platform. (It’s a very handy coincidence to have two such experience and system design report papers appearing side by side so ... Continue Reading