The last few weeks have been anything but normal for many of us. I do hope that you and your loved ones are managing to stay safe. My routines have been disrupted too, and with the closure of schools last week it's essentially the Easter holidays one week earlier than expected for my children. At … Continue reading An early end of term
Month: March 2020
Serverless in the wild: characterizing and optimising the serverless workload at a large cloud provider
Serverless in the wild: characterizing and optimising the serverless workload at a large cloud provider, Shahrad et al., arXiv 2020 This is a fresh-from-the-arXivs paper that Jonathan Mace (@mpi_jcmace) drew my attention to on Twitter last week. Thank you, Jonathan! It's a classic trade-off: the quality of service offered (better service presumably driving more volume … Continue reading Serverless in the wild: characterizing and optimising the serverless workload at a large cloud provider
An empirical guide to the behavior and use of scalable persistent memory
An empirical guide to the behavior and use of scalable persistent memory, Yang et al., FAST'20 We've looked at multiple papers exploring non-volatile main memory and its implications (e.g. most recently 'Efficient lock-free durable sets'). One thing they all had in common is an evaluation using some kind of simulation of the expected behaviour of … Continue reading An empirical guide to the behavior and use of scalable persistent memory
Understanding, detecting and localizing partial failures in large system software
Understanding, detecting and localizing partial failures in large system software, Lou et al., NSDI'20 Partial failures (gray failures) occur when some but not all of the functionalities of a system are broken. On the surface everything can appear to be fine, but under the covers things may be going astray. When a partial failure occurs, … Continue reading Understanding, detecting and localizing partial failures in large system software
When correlation (or lack of it) can be causation
Rex: preventing bugs and misconfiguration in large services using correlated change analysis, Mehta et al., NSDI'20 and Check before you change: preventing correlated failures in service updates, Zhai et al., NSDI'20 Today's post is a double header. I've chosen two papers from NSDI'20 that are both about correlation. Rex is a tool widely deployed across … Continue reading When correlation (or lack of it) can be causation
Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook
Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook, Cao et al., FAST'20 You get good at what you practice. Or in the case of key-value stores, what you benchmark. So if you want to design a system that will offer good real-world performance, it's really useful to have benchmarks that accurately represent real-world workloads. … Continue reading Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook
Building an elastic query engine on disaggregated storage
Building an elastic query engine on disaggregated storage, Vuppalapati et al., NSDI'20 This paper describes the design decisions behind the Snowflake cloud-based data warehouse. As the saying goes, 'all snowflakes are special' - but what is it exactly that's special about this one? When I think about cloud-native architectures, I think about disaggregation (enabling each resource type … Continue reading Building an elastic query engine on disaggregated storage
Millions of tiny databases
Millions of tiny databases, Brooker et al., NSDI'20 This paper is a real joy to read. It takes you through the thinking processes and engineering practices behind the design of a key part of the control plane for AWS Elastic Block Storage (EBS): the Physalia database that stores configuration information. In the same spirit as … Continue reading Millions of tiny databases
Firecracker: lightweight virtualization for serverless applications
Firecracker: lightweight virtualization for serverless applications, Agache et al., NSDI'20 Finally the NSDI'20 papers have opened up to the public (as of last week), and what a great-looking crop of papers it is. We looked at a couple of papers that had pre-prints available last week; today we'll be looking at one of the … Continue reading Firecracker: lightweight virtualization for serverless applications