Machine learning for dialog state tracking: a review Henderson MLSLP 2015 Today we turn our attention to the task of figuring out, potentially over multiple interactions with a bot, what it is the user is requesting the bot to do. This task goes by the name of Dialog State Tracking, and it’s something that Matthew … Continue reading Machine learning for dialog state tracking: a review
Tag: Machine Learning
Natural language understanding (almost) from scratch
Natural language understanding (almost) from scratch Collobert et al. Journal of Machine Learning Research 2011 Having spent much of last week looking at non-goal-driven dialogue systems trained end-to-end, today it’s time to turn our attention to some of the building blocks of natural language processing that a chatbot can take advantage of if you’re … Continue reading Natural language understanding (almost) from scratch
A survey of available corpora for building data-driven dialogue systems
A survey of available corpora for building data-driven dialogue systems Serban et al. 2015 Bear with me, it’s more interesting than it sounds :). Yes, this (46-page) paper does include a catalogue of data sets with dialogues from different domains, but it also includes a high-level survey of techniques that are used in building … Continue reading A survey of available corpora for building data-driven dialogue systems
Distributed representations of sentences and documents
Distributed representations of sentences and documents - Le & Mikolov, ICML 2014 We've previously looked at the amazing power of word vectors to learn distributed representations of words that manage to embody meaning. In today's paper, Le and Mikolov extend that approach to also compute distributed representations for sentences, paragraphs, and even entire documents. They … Continue reading Distributed representations of sentences and documents
GloVe: Global Vectors for Word Representation
GloVe: Global Vectors for Word Representation - Pennington et al. 2014 Yesterday we looked at some of the amazing properties of word vectors with word2vec. Pennington et al. argue that the online scanning approach used by word2vec is suboptimal since it doesn't fully exploit statistical information regarding word co-occurrences. They demonstrate a Global Vectors (GloVe) … Continue reading GloVe: Global Vectors for Word Representation
The amazing power of word vectors
For today's post, I've drawn material not just from one paper, but from five! The subject matter is 'word2vec' - the work of Mikolov et al. at Google on efficient vector representations of words (and what you can do with them). The papers are: Efficient Estimation of Word Representations in Vector Space - Mikolov et … Continue reading The amazing power of word vectors
CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy
CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy - Dowlin et al. 2016 With the rise of machine learning, it's easy to imagine all sorts of cloud services that can process your data and make predictions of some kind (Machine Learning as a Service - MLAS). … Continue reading CryptoNets: Applying Neural Networks to Encrypted Data with High Throughput and Accuracy
Ad Click Prediction: A View from the Trenches
Ad Click Prediction: A View from the Trenches - McMahan et al. 2013 Yesterday we looked at a tour through the many ways technical debt can creep into machine learning systems. In that paper, the authors mention an automated feature management tool that, since its adoption, "has regularly allowed a team at Google to safely … Continue reading Ad Click Prediction: A View from the Trenches
Machine Learning: The High-Interest Credit Card of Technical Debt
Machine Learning: The High-Interest Credit Card of Technical Debt - Sculley et al. 2014 Today's paper offers some pragmatic advice for the developers and maintainers of machine learning systems in production. It's easy to rush out version 1.0, the authors warn us, but making subsequent improvements can be unexpectedly difficult. You very much get the … Continue reading Machine Learning: The High-Interest Credit Card of Technical Debt
Chimera: Large-Scale Classification Using Machine Learning, Rules, and Crowdsourcing
Chimera: Large-Scale Classification Using Machine Learning, Rules, and Crowdsourcing - Sun et al. 2014 (WalmartLabs) Large-scale classification, where we need to classify hundreds of thousands or millions of items into thousands of classes, is becoming increasingly common in this age of Big Data... So far, however, very little has been published on how large-scale classification … Continue reading Chimera: Large-Scale Classification Using Machine Learning, Rules, and Crowdsourcing