Data Shapley: equitable valuation of data for machine learning

Ghorbani & Zou, ICML'19. It’s incredibly difficult from afar to make sense of the almost 800 papers published at ICML this year! In practical terms I was reduced to looking at papers highlighted by others (e.g. via best paper awards), and scanning the list …
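The core idea of the paper — valuing each training point by its average marginal contribution to model performance across random permutations — can be sketched with a simple Monte Carlo estimator. This is a hypothetical illustration, not the authors’ code; `value` is an assumed user-supplied black box that scores a model trained on a subset of data indices:

```python
import numpy as np

def mc_shapley(value, n, permutations=100, seed=0):
    """Monte Carlo estimate of data Shapley values.

    value(S) -> float: performance of a model trained on the data points
    whose indices are in the list S (an assumed black-box function).
    Returns each point's average marginal contribution over permutations.
    """
    rng = np.random.default_rng(seed)
    phi = np.zeros(n)
    for t in range(1, permutations + 1):
        perm = rng.permutation(n)
        prev, S = value([]), []
        for i in perm:
            S.append(i)
            cur = value(S)
            # Running mean of point i's marginal contribution (cur - prev).
            phi[i] += ((cur - prev) - phi[i]) / t
            prev = cur
    return phi

# Sanity check with an additive 'game': when value() is a plain sum of
# per-point weights, each point's Shapley value is exactly its weight.
weights = np.array([1.0, 2.0, 3.0])
phi = mc_shapley(lambda S: sum(weights[i] for i in S), n=3, permutations=10)
```

In the paper the expensive part is that `value(S)` means retraining (or fine-tuning) a model per subset, which is why truncation and other approximations matter in practice.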

Software engineering for machine learning: a case study

Amershi et al., ICSE'19. Previously on The Morning Paper we’ve looked at the spread of machine learning through Facebook and Google, and some of the lessons learned together with processes and tools to address the challenges arising. Today it’s the turn of Microsoft. More specifically, we’ll be …

Machine learning systems are stuck in a rut

Barham & Isard, HotOS'19. In this paper we argue that systems for numerical computing are stuck in a local basin of performance and programmability. Systems researchers are doing an excellent job improving the performance of 5-year-old benchmarks, but gradually making it harder to explore innovative machine …

A case for managed and model-less inference serving

Yadwadkar et al., HotOS'19. HotOS’19 is presenting me with something of a problem, as there are so many interesting-looking papers in the proceedings this year that it’s going to be hard to cover them all! As a transition from the SysML papers we’ve been looking at recently, …

PyTorch-BigGraph: a large-scale graph embedding system

Lerer et al., SysML'19. We looked at graph neural networks earlier this year, which operate directly over a graph structure. Via graph autoencoders or other means, another approach is to learn embeddings for the nodes in the graph, and then use these embeddings as inputs into a (regular) neural …
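To make the node-embedding idea concrete, here is a toy sketch — emphatically not PyTorch-BigGraph itself, which adds partitioning, negative-sampling batching, and multi-relation scoring at billion-edge scale. It trains embeddings with a dot-product score, logistic loss, and random negative sampling in plain NumPy; all names and hyperparameters are illustrative assumptions:

```python
import numpy as np

def train_embeddings(edges, n_nodes, dim=8, epochs=200, lr=0.1, seed=0):
    """Learn node embeddings so that linked nodes score higher (dot product)
    than randomly corrupted pairs. SGD on a logistic loss, one negative
    sample per positive edge."""
    rng = np.random.default_rng(seed)
    E = rng.normal(scale=0.1, size=(n_nodes, dim))
    edge_set = set(edges)
    for _ in range(epochs):
        for (u, v) in edges:
            # Negative sample: corrupt the target node (assumes the graph
            # is sparse enough that a non-neighbour is easy to find).
            w = int(rng.integers(n_nodes))
            while w == u or (u, w) in edge_set:
                w = int(rng.integers(n_nodes))
            s_pos, s_neg = E[u] @ E[v], E[u] @ E[w]
            # Gradients of -log sigma(s_pos) - log(1 - sigma(s_neg)).
            g_pos = 1.0 / (1.0 + np.exp(-s_pos)) - 1.0
            g_neg = 1.0 / (1.0 + np.exp(-s_neg))
            Eu = E[u].copy()
            E[u] -= lr * (g_pos * E[v] + g_neg * E[w])
            E[v] -= lr * g_pos * Eu
            E[w] -= lr * g_neg * Eu
    return E

# Two disconnected triangles: embeddings should cluster by component.
clusters = ([0, 1, 2], [3, 4, 5])
edges = [(u, v) for c in clusters for u in c for v in c if u != v]
E = train_embeddings(edges, n_nodes=6)
```

Once trained, the embedding table `E` can be fed as features into a downstream (regular) neural network, which is the workflow the excerpt above alludes to.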

Towards federated learning at scale: system design

Bonawitz et al., SysML'19. This is a high-level paper describing Google’s production system for federated learning. One of the most interesting things to me here is simply to know that Google are working on this, have a first version in production working with tens of millions …

Continuous integration of machine learning models with ease.ml/ci

Continuous integration of machine learning models with ease.ml/ci: towards a rigorous yet practical treatment, Renggli et al., SysML'19. Developing machine learning models is no different from developing traditional software, in the sense that it is also a full life cycle involving design, implementation, tuning, testing, and deployment. As machine learning models are used in more …

The why and how of nonnegative matrix factorization

Gillis, arXiv 2014 (from ‘Regularization, Optimization, Kernels, and Support Vector Machines’). Last week we looked at the paper ‘Beyond news content,’ which made heavy use of nonnegative matrix factorisation. Today we’ll be looking at that technique in a little more detail. As the name suggests, ‘The Why …
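For a concrete feel of the technique, here is a minimal NMF sketch using the classic Lee–Seung multiplicative update rules in NumPy — an illustration only, not code from the survey; the rank `r`, iteration count, and damping constant `eps` are arbitrary choices:

```python
import numpy as np

def nmf(X, r, iters=1000, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (m x n) as W @ H, with
    W (m x r) and H (r x n) nonnegative, by minimising ||X - WH||_F
    via multiplicative updates (which preserve nonnegativity)."""
    m, n = X.shape
    rng = np.random.default_rng(seed)
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# A rank-2 nonnegative matrix (row 3 = row 1 + row 2) should be
# recovered closely with r=2.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])
W, H = nmf(X, r=2)
```

The nonnegativity constraint is what gives NMF its interpretable "parts-based" factors — each row of `X` is an additive combination of the rows of `H` — which is exactly the property the ‘Beyond news content’ paper exploited.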