Sequence to sequence learning with neural networks

June 2, 2016 ~ Adrian Colyer ~ 24 Comments

Sequence to sequence learning with neural networks - Sutskever et al. NIPS, 2014 Yesterday we looked at paragraph vectors which extend the distributed word vectors approach to learn a distributed representation of a sentence, paragraph, or document. Today's paper tackles what must be one of the sternest tests of all when it comes to assessing how ... Continue Reading

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server

April 27, 2016 ~ Adrian Colyer ~ 2 Comments

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server - Cui et al. 2016 (EuroSys 2016) We know that deep learning is well suited to GPUs since it has inherent parallelism. But so far this has mostly been limited to either a single GPU (e.g. using Caffe) or to specially built distributed ... Continue Reading

ImageNet Classification with Deep Convolutional Neural Networks

April 20, 2016 ~ Adrian Colyer ~ 8 Comments

ImageNet Classification with Deep Convolutional Neural Networks - Krizhevsky et al. 2012 Like the large-vocabulary speech recognition paper we looked at yesterday, today's paper has also been described as a landmark paper in the history of deep learning. It's also a surprisingly easy read! The ImageNet dataset contains over 15 million labeled high-resolution images of ... Continue Reading

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition

April 19, 2016 ~ Adrian Colyer ~ 6 Comments

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition - Dahl et al. 2011 The title may be a bit of a mouthful, but this paper is often cited as a watershed moment for deep learning and speech recognition. It represents the first application of deep neural networks for large vocabulary speech recognition (LVSR), and ... Continue Reading

Deep Learning in Neural Networks: An Overview

April 18, 2016 ~ Adrian Colyer ~ 10 Comments

Deep Learning in Neural Networks: An Overview - Schmidhuber 2014 What a wonderful treasure trove this paper is! Schmidhuber provides all the background you need to gain an overview of deep learning (as of 2014) and how we got there through the preceding decades. Starting from recent DL results, I tried to trace back the ... Continue Reading

Distributed TensorFlow with MPI

March 18, 2016 ~ Adrian Colyer ~ 4 Comments

Distributed TensorFlow with MPI - Vishnu et al. 2016 A short early-release paper to close out the week, which looks at how to support machine learning and data mining (MLDM) with Google's TensorFlow in a distributed setting. The paper also contains some good background on TensorFlow itself as well as MPI - ... Continue Reading

Strategic Dialogue Management via Deep Reinforcement Learning

March 11, 2016 ~ Adrian Colyer ~ 1 Comment

Strategic Dialogue Management via Deep Reinforcement Learning - Cuayahuitl et al. 2015 If computers learning to play Atari arcade games by themselves isn't really your thing, perhaps you're more into board games? How about a Deep Reinforcement Learning system that learns how to trade effectively in Settlers of Catan! Again, we're not talking about a ... Continue Reading

Memory Networks

March 10, 2016 ~ Adrian Colyer ~ 14 Comments

Memory Networks - Weston et al. 2015 As with the Neural Turing Machine that we looked at yesterday, this paper looks at extending machine learning models with a memory component. The Neural Turing Machine work was developed at Google by the DeepMind team; today's paper on Memory Networks was developed by the Facebook AI Research group. ... Continue Reading

Neural Turing Machines

March 9, 2016 ~ Adrian Colyer ~ 15 Comments

Neural Turing Machines - Graves et al. 2014 (Google DeepMind) A Neural Turing Machine is a Neural Network extended with a working memory, which, as we'll see, gives it very impressive learning abilities. A Neural Turing Machine (NTM) architecture contains two basic components: a neural network controller and a memory bank. Like most neural networks, ... Continue Reading

Graying the Black Box: Understanding DQNs

March 2, 2016 ~ Adrian Colyer ~ 13 Comments

Graying the Black Box: Understanding DQNs - Zahavy et al. 2016 It's hard to escape the excitement around deep learning these days. Over the last couple of days we looked at some of the lessons learned by Google's machine learning systems teams, including the need to develop ways of getting insights into the predictions made ... Continue Reading

the morning paper

a random walk through Computer Science research, by Adrian Colyer

Deep Learning