Towards federated learning at scale: system design

June 7, 2019 ~ adriancolyer ~ 1 Comment

Towards federated learning at scale: system design Bonawitz et al., SysML 2019 This is a high-level paper describing Google’s production system for federated learning. One of the most interesting things to me here is simply to know that Google are working on this, have a first version in production working with tens of millions …

Data validation for machine learning

June 5, 2019 ~ adriancolyer ~ 15 Comments

Data validation for machine learning Breck et al., SysML'19 Last time out we looked at continuous integration testing of machine learning models, but arguably even more important than the model is the data. Garbage in, garbage out. In this paper we focus on the problem of validating the input data fed to ML pipelines. The …

Continuous integration of machine learning models with ease.ml/ci

June 3, 2019 ~ adriancolyer ~ 2 Comments

Continuous integration of machine learning models with ease.ml/ci: towards a rigorous yet practical treatment Renggli et al., SysML'19 Developing machine learning models is no different from developing traditional software, in the sense that it is also a full life cycle involving design, implementation, tuning, testing, and deployment. As machine learning models are used in more …

A case for lease-based, utilitarian resource management on mobile devices

May 31, 2019 ~ adriancolyer ~ 4 Comments

A case for lease-based, utilitarian resource management on mobile devices Hu et al., ASPLOS'19 I’ve chosen another energy-related paper to end the week, addressing a problem many people can relate to: apps that drain your battery. LeaseOS borrows the concept of a lease from distributed systems, but with a rather nice twist, and is able …

Boosted race trees for low energy classification

May 29, 2019 ~ adriancolyer

Boosted race trees for low energy classification Tzimpragos et al., ASPLOS'19 We don’t talk about energy as often as we probably should on this blog, but it’s certainly true that our data centres and various IT systems consume an awful lot of it. So it’s interesting to see a paper using nano-Joules per prediction as …

CheriABI: enforcing valid pointer provenance and minimizing pointer privilege in the POSIX C run-time environment

May 28, 2019 ~ adriancolyer ~ 1 Comment

CheriABI: enforcing valid pointer provenance and minimizing pointer privilege in the POSIX C run-time environment Davis et al., ASPLOS'19 Last week we saw the benefits of rethinking memory and pointer models at the hardware level when it came to object storage and compression (Zippads). CHERI also rethinks the way that pointers and memory work, but …

Compress objects, not cache lines: an object-based compressed memory hierarchy

May 24, 2019 ~ adriancolyer ~ 14 Comments

Compress objects, not cache lines: an object-based compressed memory hierarchy Tsai & Sanchez, ASPLOS'19 Last time out we saw how Google have been able to save millions of dollars through memory compression enabled via zswap. One of the important attributes of their design was easy and rapid deployment across an existing fleet. Today’s paper introduces …

Software-defined far memory in warehouse scale computers

May 22, 2019 ~ adriancolyer ~ 13 Comments

Software-defined far memory in warehouse-scale computers Lagar-Cavilla et al., ASPLOS'19 Memory (DRAM) remains comparatively expensive, while in-memory computing demands are growing rapidly. This makes memory a critical factor in the total cost of ownership (TCO) of large compute clusters, or as Google like to call them "Warehouse-scale computers (WSCs)." This paper describes a "far memory" …

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs

May 20, 2019 ~ adriancolyer

RPCValet: NI-driven tail-aware balancing of µs-scale RPCs Daglis et al., ASPLOS'19 Last week we learned about the increased tail-latency sensitivity of microservices-based applications with high RPC fan-outs. Seer uses estimates of queue depths to mitigate latency spikes on the order of 10-100ms, in conjunction with a cluster manager. Today’s paper choice, RPCValet, operates at …

Understanding real-world concurrency bugs in Go

May 17, 2019 ~ adriancolyer ~ 25 Comments

Understanding real-world concurrency bugs in Go Tu, Liu et al., ASPLOS'19 The design of a programming (or data) model not only makes certain problems easier (or harder) to solve, but also makes certain classes of bugs easier (or harder) to create, detect, and subsequently fix. Today’s paper choice studies concurrency mechanisms in Go. Before we …