Learning to learn by gradient descent by gradient descent

January 4, 2017 (updated July 31, 2017) ~ adriancolyer ~ 3 Comments

Learning to learn by gradient descent by gradient descent - Andrychowicz et al., NIPS 2016. One of the things that strikes me when I read these NIPS papers is just how short some of them are - between the introduction and the evaluation sections you might find only one or two pages! A general form is …

Matching networks for one shot learning

January 3, 2017 (updated July 31, 2017) ~ adriancolyer ~ 9 Comments

Matching networks for one shot learning - Vinyals et al. (Google DeepMind), NIPS 2016. Yesterday we saw a neural network that can learn basic Newtonian physics. On reflection that's not totally surprising since we know that deep networks are very good at learning functions of the kind that describe our natural world. Alongside an intuitive understanding …

Interaction networks for learning about objects, relations and physics

January 2, 2017 (updated July 31, 2017) ~ adriancolyer ~ 8 Comments

Interaction networks for learning about objects, relations and physics - Google DeepMind, NIPS 2016. Welcome back! There were so many great papers from OSDI '16 to cover at the end of last year that I didn't have a chance to get to NIPS. I'm therefore kicking off this year with a few of the Google DeepMind …

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server

April 27, 2016 (updated July 27, 2017) ~ adriancolyer ~ 2 Comments

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server - Cui et al. 2016 (EuroSys 2016). We know that deep learning is well suited to GPUs since it has inherent parallelism. But so far this has mostly been limited to either a single GPU (e.g. using Caffe) or to specially built distributed …

The amazing power of word vectors

April 21, 2016 (updated July 27, 2017) ~ adriancolyer ~ 183 Comments

For today's post, I've drawn material not just from one paper, but from five! The subject matter is 'word2vec' - the work of Mikolov et al. at Google on efficient vector representations of words (and what you can do with them). The papers are: Efficient Estimation of Word Representations in Vector Space - Mikolov et …

ImageNet Classification with Deep Convolutional Neural Networks

April 20, 2016 (updated July 27, 2017) ~ adriancolyer ~ 8 Comments

ImageNet Classification with Deep Convolutional Neural Networks - Krizhevsky et al. 2012. Like the large-vocabulary speech recognition paper we looked at yesterday, today's paper has also been described as a landmark paper in the history of deep learning. It's also a surprisingly easy read! The ImageNet dataset contains over 15 million labeled high-resolution images of …

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition

April 19, 2016 (updated July 27, 2017) ~ adriancolyer ~ 6 Comments

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition - Dahl et al. 2011. The title may be a bit of a mouthful, but this paper is often cited as a watershed moment for deep learning and speech recognition. It represents the first application of deep neural networks for large vocabulary speech recognition (LVSR), and …

Deep Learning in Neural Networks: An Overview

April 18, 2016 (updated July 27, 2017) ~ adriancolyer ~ 10 Comments

Deep Learning in Neural Networks: An Overview - Schmidhuber 2014. What a wonderful treasure trove this paper is! Schmidhuber provides all the background you need to gain an overview of deep learning (as of 2014) and how we got there through the preceding decades. Starting from recent DL results, I tried to trace back the …

Mining and Summarizing Customer Reviews

August 28, 2015 (updated July 26, 2017) ~ adriancolyer

Mining and Summarizing Customer Reviews - Hu and Liu 2004. This is the third of the three 'test-of-time' award winners from KDD'15. From the awards page: The paper introduces the problem of summarizing customer reviews and decomposes the problem into the three steps of (1) mining product features (aspects), (2) identifying opinion sentences and their …

Optimizing Search Engines using Clickthrough Data

August 27, 2015 (updated July 26, 2017) ~ adriancolyer

Optimizing Search Engines using Clickthrough Data - Joachims, 2002. Today's choice is another KDD 'test-of-time' winner. The paper introduced the problem of ranking documents w.r.t. a query using not explicit user feedback but implicit user feedback in the form of clickthrough data. The author presented the Ranking SVM Algorithm to solve the proposed ranking problem. …