Tag: Machine Learning

Extending relational query processing with ML inference
Extending relational query processing with ML inference, Karanasos et al., CIDR'20 This paper provides a little more detail on the concrete work that Microsoft is doing to embed machine learning inference inside an RDBMS, as part of their vision for Enterprise Grade Machine Learning. The motivation is not that inference will perform better inside the database, but … Continue reading Extending relational query processing with ML inference
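To make the shape of in-database inference concrete, here's a minimal sketch (mine, not the paper's) using sqlite3's ability to register a Python scalar function, so that model scoring runs inside query evaluation rather than in a separate serving tier. The schema, the predict name, and the stand-in model are all invented for illustration; Raven and SQL Server integrate ONNX Runtime far more deeply, co-optimising the query plan and the model graph.

```python
import sqlite3

def score(age: float, income: float) -> float:
    # Stand-in for a real model call (e.g. an ONNX Runtime session);
    # the weights are made up.
    return 1.0 if (0.02 * age + 0.00001 * income) > 1.0 else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (age REAL, income REAL)")
conn.execute("INSERT INTO customers VALUES (45, 90000), (23, 30000)")
conn.create_function("predict", 2, score)  # expose the 'model' to SQL

# Scoring now runs inside query evaluation, next to the data.
for row in conn.execute("SELECT age, predict(age, income) FROM customers"):
    print(row)  # (45.0, 1.0) then (23.0, 0.0)
```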
Cloudy with a high chance of DBMS: a 10-year prediction for enterprise-grade ML
Cloudy with a high chance of DBMS: a 10-year prediction for enterprise-grade ML, Agrawal et al., CIDR'20 "Cloudy with a high chance of DBMS" is a fascinating vision paper from a group of experts at Microsoft, looking at the transition of machine learning from being primarily the domain of large-scale, high-volume consumer applications to being … Continue reading Cloudy with a high chance of DBMS: a 10-year prediction for enterprise-grade ML
Migrating a privacy-safe information extraction system to a Software 2.0 design
Migrating a privacy-safe information extraction system to a Software 2.0 design, Sheng et al., CIDR'20 This is a comparatively short (7 pages) but very interesting paper detailing the migration of a software system to a 'Software 2.0' design. Software 2.0, in case you missed it, is a term coined by Andrej Karpathy to describe software in which … Continue reading Migrating a privacy-safe information extraction system to a Software 2.0 design
POTS: protective optimization technologies
POTS: Protective optimization technologies, Kulynych, Overdorf et al., arXiv 2019 With thanks to @TedOnPrivacy for recommending this paper via Twitter. Last time out we looked at fairness in the context of machine learning systems, coming to the realisation that you can't define 'fair' solely from the perspective of an algorithm and the data it is … Continue reading POTS: protective optimization technologies
The measure and mismeasure of fairness: a critical review of fair machine learning
The measure and mismeasure of fairness: a critical review of fair machine learning, Corbett-Davies & Goel, arXiv 2018 With many thanks to Ben Fried and the ACM Queue editorial board for the paper recommendation. We've visited the topic of fairness in the context of machine learning several times on The Morning Paper (see e.g. [1], … Continue reading The measure and mismeasure of fairness: a critical review of fair machine learning
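As a taste of why 'measure' and 'mismeasure' both appear in the title, here's a toy sketch (all numbers invented) showing that two common fairness definitions can pull in different directions: a classifier can satisfy demographic parity across groups while violating equalized odds.

```python
# Rows are (group, y_true, y_pred); numbers invented for illustration.
data = [
    ("A", 1, 1), ("A", 1, 1), ("A", 0, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 0, 1), ("B", 0, 0),
]

def positive_rate(rows):
    return sum(pred for _, _, pred in rows) / len(rows)

for g in ("A", "B"):
    rows = [r for r in data if r[0] == g]
    tpr = positive_rate([r for r in rows if r[1] == 1])
    fpr = positive_rate([r for r in rows if r[1] == 0])
    print(g, "P(pred=1):", positive_rate(rows), "TPR:", tpr, "FPR:", fpr)
# A P(pred=1): 0.5 TPR: 1.0 FPR: 0.0
# B P(pred=1): 0.5 TPR: 0.5 FPR: 0.5
# Demographic parity holds (equal positive-prediction rates) while
# equalized odds fails (TPR and FPR differ): the definitions conflict.
```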
Programmatically interpretable reinforcement learning
Programmatically interpretable reinforcement learning, Verma et al., ICML 2018 Being able to trust (interpret, verify) a controller learned through reinforcement learning (RL) is one of the key challenges for real-world deployments of RL that we looked at earlier this week. It's also an essential requirement for agents in human-machine collaborations (i.e., all deployments at some … Continue reading Programmatically interpretable reinforcement learning
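For a flavour of what 'programmatic' means here, the sketch below (my own toy example, not the paper's DSL) shows a policy expressed as a short, human-readable control law for a pole-balancing-style task; this is the kind of program PIRL's neurally directed search discovers, guided by a conventional neural policy acting as an oracle.

```python
# Illustrative gains; PIRL's search would discover both the program
# structure and the constants, so nothing here is from the paper.
K_ANGLE, K_VELOCITY = 10.0, 1.0

def programmatic_policy(angle: float, angle_velocity: float) -> int:
    """A policy you can read: push right (1) if the weighted lean
    exceeds zero, otherwise push left (0)."""
    u = K_ANGLE * angle + K_VELOCITY * angle_velocity
    return 1 if u > 0 else 0

print(programmatic_policy(0.05, -0.2))  # 1: leaning right, push right to recover
```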
Challenges of real-world reinforcement learning
Challenges of real-world reinforcement learning, Dulac-Arnold et al., ICML'19 Last week we looked at some of the challenges inherent in automation and in building systems where humans and software agents collaborate. When we start talking about agents, policies, and modelling the environment, my thoughts naturally turn to reinforcement learning (RL). Today's paper choice sets out … Continue reading Challenges of real-world reinforcement learning
Optimized risk scores
Optimized risk scores, Ustun & Rudin, KDD'17 On Monday we looked at the case for interpretable models, and in Wednesday’s edition of The Morning Paper we looked at CORELS which produces provably optimal rule lists for categorical assessments. Today we’ll be looking at RiskSLIM, which produces risk score models together with a proof of optimality. … Continue reading Optimized risk scores
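To see what a risk score model actually is, here's a minimal sketch assuming a RiskSLIM-style output: a handful of conditions with small integer points, summed and mapped to a predicted risk via the logistic function, so a human can score a case by mental arithmetic. The conditions, points, and intercept below are invented, not taken from the paper.

```python
import math

# Invented conditions and integer points, in the style of a risk score.
RISK_MODEL = [
    ("age >= 75",             2),
    ("prior_admissions >= 2", 3),
    ("on_dialysis",           4),
]
INTERCEPT = -5  # illustrative offset

def predicted_risk(case: dict) -> float:
    """Sum the points for the conditions that hold, then map the
    integer score to a probability with the logistic function."""
    score = INTERCEPT + sum(points for condition, points in RISK_MODEL
                            if case.get(condition, False))
    return 1.0 / (1.0 + math.exp(-score))

print(round(predicted_risk({"age >= 75": True, "on_dialysis": True}), 2))  # 0.73
```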
Learning certifiably optimal rule lists for categorical data
Learning certifiably optimal rule lists for categorical data, Angelino et al., JMLR 2018 Today we’re taking a closer look at CORELS, the Certifiably Optimal RulE ListS algorithm that we encountered in Rudin’s arguments for interpretable models earlier this week. We’ve been able to create rule lists (decision trees) for a long time, e.g. using CART, … Continue reading Learning certifiably optimal rule lists for categorical data
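And to see what a rule list looks like as an object, here's a minimal sketch of evaluating one: an ordered sequence of if-then rules over categorical features, falling through to a default prediction. The rules and labels are made up for illustration; CORELS's contribution is learning such a list with a certificate that no better one exists over the candidate rules.

```python
# Made-up rules for illustration; CORELS learns the ordered list itself.
RULE_LIST = [
    (lambda x: x["priors"] > 3,                   True),  # if priors > 3, predict True
    (lambda x: x["age"] < 23 and x["priors"] > 1, True),  # else if young with priors, True
]
DEFAULT = False  # fall-through prediction

def predict(x: dict) -> bool:
    """Return the label of the first rule whose antecedent fires."""
    for antecedent, label in RULE_LIST:
        if antecedent(x):
            return label
    return DEFAULT

print(predict({"priors": 5, "age": 40}))  # True (first rule fires)
print(predict({"priors": 0, "age": 30}))  # False (default)
```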
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead, Rudin, arXiv 2019 With thanks to Glyn Normington for pointing out this paper to me. It’s pretty clear from the title alone what Cynthia Rudin would like us to do! The paper is a mix of technical … Continue reading Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead