Slicer: Auto-sharding for datacenter applications

Slicer: Auto-sharding for datacenter applications Adya et al. (Google) OSDI 2016 Another piece of Google's back-end infrastructure is revealed in this paper, ready, no doubt, to spawn some new open source implementations of the same ideas. Slicer is a general-purpose sharding service. I normally think of sharding as something that happens within a (typically …

Smart Reply: Automated response suggestion for email

Smart Reply: Automated response suggestion for email Kannan, Kaufman, Karach, et al. KDD 2016 I’m sure you’ve come across (or at least heard of) Google Inbox’s smart reply feature for mobile email by now. It’s currently used for 10% of all mobile replies, which must equate to a very large number of messages per day. …

Mastering the game of Go with deep neural networks and tree search

Mastering the Game of Go with Deep Neural Networks and Tree Search Silver, Huang et al., Nature vol 529, 2016 Pretty much everyone has heard about AlphaGo’s tremendous Go-playing success, beating the European champion by 5 games to 0. In all the excitement at the time, less was written about how AlphaGo actually worked …

Deep neural networks for YouTube recommendations

Deep Neural Networks for YouTube Recommendations Covington et al., RecSys '16 The lovely people at InfoQ have been very kind to The Morning Paper, producing beautiful-looking "Quarterly Editions." Today's paper choice was first highlighted to me by InfoQ's very own Charles Humble. In it, Google describe how they overhauled the YouTube recommendation system using …

Distributed representations of sentences and documents

Distributed representations of sentences and documents - Le & Mikolov, ICML 2014 We've previously looked at the amazing power of word vectors to learn distributed representations of words that manage to embody meaning. In today's paper, Le and Mikolov extend that approach to also compute distributed representations for sentences, paragraphs, and even entire documents. They …

Maglev: A Fast and Reliable Software Network Load Balancer

Maglev: A Fast and Reliable Software Network Load Balancer - Eisenbud et al. 2016 Maglev is Google's software load balancer used within all their datacenters. It offers greater scalability and availability than hardware load balancers, enables quick iteration, and is much easier to upgrade. Maglev is just another distributed system running on the commodity …

HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm

HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm - Heule et al. 2013 Continuing on the theme of approximations from yesterday, today's paper looks at what must be one of the best known approximate data structures after the Bloom Filter, HyperLogLog. It's HyperLogLog with a twist though - a …

Secrets, Lies, and Account Recovery: Lessons from the Use of Personal Knowledge Questions at Google

Secrets, Lies, and Account Recovery: Lessons from the Use of Personal Knowledge Questions at Google - Bonneau et al. 2015 What was your mother's maiden name? What was your city of birth? What was the name of your first school? I don't know about you, but I always groan inwardly when a website asks such …