Learning to learn by gradient descent by gradient descent - Andrychowicz et al., NIPS 2016. One of the things that strikes me when I read these NIPS papers is just how short some of them are: between the introduction and the evaluation sections you might find only one or two pages! A general form is …
Category: Machine Learning
The machine learning subset of AI. Includes deep learning among other topics.
Matching networks for one shot learning
Matching networks for one shot learning - Vinyals et al. (Google DeepMind), NIPS 2016. Yesterday we saw a neural network that can learn basic Newtonian physics. On reflection that's not totally surprising, since we know that deep networks are very good at learning functions of the kind that describe our natural world. Alongside an intuitive understanding …
Interaction networks for learning about objects, relations and physics
Interaction networks for learning about objects, relations and physics - Google DeepMind, NIPS 2016. Welcome back! There were so many great papers from OSDI '16 to cover at the end of last year that I didn't have a chance to get to NIPS. I'm therefore kicking off this year with a few of the Google DeepMind …
GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server
GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server - Cui et al. 2016 (EuroSys 2016). We know that deep learning is well suited to GPUs since it has inherent parallelism. But so far this has mostly been limited to either a single GPU (e.g. using Caffe) or to specially built distributed …
The amazing power of word vectors
For today's post, I've drawn material not just from one paper, but from five! The subject matter is 'word2vec' - the work of Mikolov et al. at Google on efficient vector representations of words (and what you can do with them). The papers are: Efficient Estimation of Word Representations in Vector Space - Mikolov et …
ImageNet Classification with Deep Convolutional Neural Networks
ImageNet Classification with Deep Convolutional Neural Networks - Krizhevsky et al. 2012. Like the large-vocabulary speech recognition paper we looked at yesterday, today's paper has also been described as a landmark paper in the history of deep learning. It's also a surprisingly easy read! The ImageNet dataset contains over 15 million labeled high-resolution images of …
Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition
Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition - Dahl et al. 2011. The title may be a bit of a mouthful, but this paper is often cited as a watershed moment for deep learning and speech recognition. It represents the first application of deep neural networks for large vocabulary speech recognition (LVSR), and …
Deep Learning in Neural Networks: An Overview
Deep Learning in Neural Networks: An Overview - Schmidhuber 2014. What a wonderful treasure trove this paper is! Schmidhuber provides all the background you need to gain an overview of deep learning (as of 2014) and how we got there through the preceding decades. Starting from recent DL results, I tried to trace back the …
Mining and Summarizing Customer Reviews
Mining and Summarizing Customer Reviews - Hu and Liu 2004. This is the third of the three 'test-of-time' award winners from KDD'15. From the awards page: The paper introduces the problem of summarizing customer reviews and decomposes the problem into the three steps of (1) mining product features (aspects), (2) identifying opinion sentences and their …
Optimizing Search Engines using Clickthrough Data
Optimizing Search Engines using Clickthrough Data - Joachims, 2002. Today's choice is another KDD 'test-of-time' winner. The paper introduced the problem of ranking documents w.r.t. a query using not explicit user feedback but implicit user feedback in the form of clickthrough data. The author presented the Ranking SVM Algorithm to solve the proposed ranking problem. …