Data Shapley: equitable valuation of data for machine learning Ghorbani & Zou, ICML'19 It’s incredibly difficult from afar to make sense of the almost 800 papers published at ICML this year! In practical terms I was reduced to looking at papers highlighted by others (e.g. via best paper awards), and scanning the list …
Tag: Machine Learning
The machine learning subset of AI. Includes deep learning among other topics.
Software engineering for machine learning: a case study
Software engineering for machine learning: a case study Amershi et al., ICSE'19 Previously on The Morning Paper we’ve looked at the spread of machine learning through Facebook and Google and some of the lessons learned together with processes and tools to address the challenges arising. Today it’s the turn of Microsoft. More specifically, we’ll be …
Machine learning systems are stuck in a rut
Machine learning systems are stuck in a rut Barham & Isard, HotOS'19 In this paper we argue that systems for numerical computing are stuck in a local basin of performance and programmability. Systems researchers are doing an excellent job improving the performance of 5-year old benchmarks, but gradually making it harder to explore innovative machine …
A case for managed and model-less inference serving
A case for managed and model-less inference serving Yadwadkar et al., HotOS'19 HotOS’19 is presenting me with something of a problem: there are so many interesting-looking papers in the proceedings this year that it’s going to be hard to cover them all! As a transition from the SysML papers we’ve been looking at recently, …
PyTorch-BigGraph: a large-scale graph embedding system
PyTorch-BigGraph: a large-scale graph embedding system Lerer et al., SysML'19 We looked at graph neural networks earlier this year, which operate directly over a graph structure. Via graph autoencoders or other means, another approach is to learn embeddings for the nodes in the graph, and then use these embeddings as inputs into a (regular) neural …
Towards federated learning at scale: system design
Towards federated learning at scale: system design Bonawitz et al., SysML 2019 This is a high level paper describing Google’s production system for federated learning. One of the most interesting things to me here is simply to know that Google are working on this, have a first version in production working with tens of millions …
Data validation for machine learning
Data validation for machine learning Breck et al., SysML'19 Last time out we looked at continuous integration testing of machine learning models, but arguably even more important than the model is the data. Garbage in, garbage out. In this paper we focus on the problem of validating the input data fed to ML pipelines. The …
Continuous integration of machine learning models with ease.ml/ci
Continuous integration of machine learning models with ease.ml/ci: towards a rigorous yet practical treatment Renggli et al., SysML'19 Developing machine learning models is no different from developing traditional software, in the sense that it is also a full life cycle involving design, implementation, tuning, testing, and deployment. As machine learning models are used in more …
Boosted race trees for low energy classification
Boosted race trees for low energy classification Tzimpragos et al., ASPLOS'19 We don’t talk about energy as often as we probably should on this blog, but it’s certainly true that our data centres and various IT systems consume an awful lot of it. So it’s interesting to see a paper using nano-Joules per prediction as …
The why and how of nonnegative matrix factorization
The why and how of nonnegative matrix factorization Gillis, arXiv 2014 from: ‘Regularization, Optimization, Kernels, and Support Vector Machines.’ Last week we looked at the paper ‘Beyond news content,’ which made heavy use of nonnegative matrix factorisation. Today we’ll be looking at that technique in a little more detail. As the name suggests, ‘The Why …