Neural Turing Machines

March 9, 2016 ~ Adrian Colyer ~ 15 Comments

Neural Turing Machines - Graves et al. 2014 (Google DeepMind) A Neural Turing Machine is a Neural Network extended with a working memory, which, as we'll see, gives it very impressive learning abilities. A Neural Turing Machine (NTM) architecture contains two basic components: a neural network controller and a memory bank. Like most neural networks, ... Continue Reading

Google’s Hybrid Approach to Research

March 4, 2016 ~ Adrian Colyer

Google's Hybrid Approach to Research - Spector et al. 2012 Something a little different to close out the week, a paper describing how Google conduct research. It's a fascinating look at how they balance fundamental and applied research, how they integrate research into product teams, and how they measure the contribution of the research. I ... Continue Reading

Ad Click Prediction: A View from the Trenches

March 1, 2016 ~ Adrian Colyer ~ 2 Comments

Ad Click Prediction: a View from the Trenches - McMahan et al. 2013 Yesterday we looked at a tour through the many ways technical debt can creep into machine learning systems. In that paper, the authors mention an automated feature management tool that since its adoption, "has regularly allowed a team at Google to safely ... Continue Reading

Machine Learning: The High-Interest Credit Card of Technical Debt

February 29, 2016 ~ Adrian Colyer ~ 9 Comments

Machine Learning: The High-Interest Credit Card of Technical Debt - Sculley et al. 2014 Today's paper offers some pragmatic advice for the developers and maintainers of machine learning systems in production. It's easy to rush out version 1.0, the authors warn us, but making subsequent improvements can be unexpectedly difficult. You very much get the ... Continue Reading

Dapper, A Large Scale Distributed Systems Tracing Infrastructure

October 6, 2015 ~ Adrian Colyer ~ 7 Comments

Dapper, A Large Scale Distributed Systems Tracing Infrastructure - Sigelman et al. (Google) 2010 I'm going to dedicate the rest of this week to a series of papers addressing the important question of "how the hell do I know what is going on in my distributed system / cloud platform / microservices deployment?" As we'll ... Continue Reading

Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network

September 11, 2015 ~ Adrian Colyer ~ 1 Comment

Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network - Singh et al. (Google) 2015 Let's end the week with something completely different: a look at ten years and five generations of networking within Google's datacenters. Bandwidth demands within the datacenter are doubling every 12-15 months, even faster than the ... Continue Reading

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

August 21, 2015 ~ Adrian Colyer ~ 4 Comments

MillWheel: Fault-Tolerant Stream Processing at Internet Scale - Akidau et al. (Google) 2013 Earlier this week we looked at the Google Cloud Dataflow model which is implemented on top of FlumeJava (for batch) and MillWheel (for streaming): We have implemented this model internally in FlumeJava, with MillWheel used as the underlying execution engine for streaming ... Continue Reading

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing

August 18, 2015 ~ Adrian Colyer ~ 10 Comments

The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing - Akidau et al. (Google) - 2015 With thanks to William Vambenepe for suggesting this paper via Twitter. Google Cloud Dataflow reached GA last week, and the team behind Cloud Dataflow have a paper accepted at VLDB'15 ... Continue Reading

Heracles: Improving Resource Efficiency at Scale

June 16, 2015 ~ Adrian Colyer

Heracles: Improving Resource Efficiency at Scale - Lo et al. 2015 Until recently, scaling from Moore’s law provided higher compute per dollar with every server generation, allowing datacenters to scale without raising the cost. However, with several imminent challenges in technology scaling, alternate approaches are needed. Those approaches involve increasing server utilization, which is still ... Continue Reading

Pregel: A System for Large-Scale Graph Processing

May 26, 2015 ~ Adrian Colyer ~ 13 Comments

Pregel: A System for Large-Scale Graph Processing - Malewicz et al. (Google) 2010 "Many practical computing problems concern large graphs." Yesterday we looked at some of the models for understanding networks and graphs. Today's paper focuses on processing of graphs, especially the efficient processing of large graphs where large can mean billions of vertices and ... Continue Reading

the morning paper

a random walk through Computer Science research, by Adrian Colyer

Google