Slicer: Auto-sharding for datacenter applications Adya et al. (Google) OSDI 2016 Another piece of Google's back-end infrastructure is revealed in this paper, ready to spawn some new open source implementations of the same ideas no doubt. Slicer is a general purpose sharding service. I normally think of sharding as something that happens within a (typically …
Tag: Google
Google technology and systems.
Smart Reply: Automated response suggestion for email
Smart Reply: Automated response suggestion for email Kannan, Kaufman, Karach, et al. KDD 2016 I’m sure you’ve come across (or at least heard of) Google Inbox’s smart reply feature for mobile email by now. It’s currently used for 10% of all mobile replies, which must equate to a very large number of messages per day. …
Mastering the game of Go with deep neural networks and tree search
Mastering the Game of Go with Deep Neural Networks and Tree Search Silver, Huang et al., Nature vol 529, 2016 Pretty much everyone has heard about AlphaGo’s tremendous Go-playing success, beating the European champion by 5 games to 0. In all the excitement at the time, less was written about how AlphaGo actually worked …
Deep neural networks for YouTube recommendations
Deep Neural Networks for YouTube Recommendations Covington et al., RecSys '16 The lovely people at InfoQ have been very kind to The Morning Paper, producing beautiful-looking "Quarterly Editions." Today's paper choice was first highlighted to me by InfoQ's very own Charles Humble. In it, Google describe how they overhauled the YouTube recommendation system using …
Goods: organizing Google’s datasets
Goods: organizing Google’s datasets Halevy et al. SIGMOD 2016 You can (try and) build a data cathedral. Or you can build a data bazaar. By data cathedral I’m referring to a centralised Enterprise Data Management solution that everyone in the company buys into and pays homage to, making a pilgrimage to the EDM every time …
Distributed representations of sentences and documents
Distributed representations of sentences and documents - Le & Mikolov, ICML 2014 We've previously looked at the amazing power of word vectors to learn distributed representations of words that manage to embody meaning. In today's paper, Le and Mikolov extend that approach to also compute distributed representations for sentences, paragraphs, and even entire documents. They …
The amazing power of word vectors
For today's post, I've drawn material not just from one paper, but from five! The subject matter is 'word2vec' - the work of Mikolov et al. at Google on efficient vector representations of words (and what you can do with them). The papers are: Efficient Estimation of Word Representations in Vector Space - Mikolov et …
Maglev: A Fast and Reliable Software Network Load Balancer
Maglev: A Fast and Reliable Software Network Load Balancer - Eisenbud et al. 2016 Maglev is Google's software load balancer used within all their datacenters. It offers greater scalability and availability than hardware load balancers, enables quick iteration, and is much easier to upgrade. Maglev is just another distributed system running on the commodity …
HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm
HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm - Heule et al. 2013 Continuing on the theme of approximations from yesterday, today's paper looks at what must be one of the best known approximate data structures after the Bloom Filter, HyperLogLog. It's HyperLogLog with a twist though - a …
Secrets, Lies, and Account Recovery: Lessons from the Use of Personal Knowledge Questions at Google
Secrets, Lies, and Account Recovery: Lessons from the Use of Personal Knowledge Questions at Google - Bonneau et al. 2015 What was your mother's maiden name? What was your city of birth? What was the name of your first school? I don't know about you, but I always groan inwardly when a website asks such …