Moving Fast with Software Verification - Calcagno et al. 2015 This is a story of transporting ideas from recent theoretical research in reasoning about programs into the fast-moving engineering culture of Facebook. The context is that most of the authors landed at Facebook in September of 2013, when we brought the INFER static analyser with …
Tag: Facebook
Fail at Scale & Controlling Queue Delay
Controlling Queue Delay - Nichols & Van Jacobson, 2012, and Fail at Scale - Maurer, 2015 Fail at Scale (Maurer) Ben Maurer recently wrote a great article for ACM Queue on how Facebook achieves reliability in the face of rapid change: To keep Facebook reliable in the face of rapid change we study common patterns …
Existential Consistency: Measuring and Understanding Consistency at Facebook
Existential Consistency: Measuring and Understanding Consistency at Facebook - Lu et al. 2015 At the core of this paper is an analysis of the number of anomalies seen in Facebook's production system for clients of TAO, which is impressively low under normal operation (0.0004%) - to interpret that number of course, we'll have to dig …
Holistic Configuration Management at Facebook
Holistic Configuration Management at Facebook - Tang et al. (Facebook) 2015 This paper gives a comprehensive description of the use cases, design, implementation, and usage statistics of a suite of tools that manage Facebook’s configuration end-to-end, including the frontend products, backend systems, and mobile apps. The configuration for Facebook's site is updated thousands of times …
The Mystery Machine: End-to-end performance analysis of large-scale internet services
The Mystery Machine: End-to-end performance analysis of large-scale internet services - Chow et al. 2014 Google's Dapper paper is very well known, but Facebook's Mystery Machine seems to be much less well known - and that's a shame because I have a hunch the approach could be very relevant to many people. Current debugging and …
Fast Database Restarts at Facebook
Fast Database Restarts at Facebook - Goel et al. 2014 In security, you're only as secure as your weakest link in the chain. When it comes to agility, you're only as fast as your slowest link in the chain. Updating and evolving a stateless middle tier is usually pretty quick, but what if you need …
TAO: Facebook’s Distributed Data Store for the Social Graph
TAO: Facebook's Distributed Data Store for the Social Graph - Bronson et al. (Facebook) 2013 A single Facebook page may aggregate and filter hundreds of items from the social graph. We present each user with content tailored to them, and we filter every item with privacy checks that take into account the current viewer. This extreme …
Wormhole: Reliable pub-sub to support Geo-Replicated Internet Services
Wormhole: Reliable pub-sub to support Geo-Replicated Internet Services - Sharma et al. 2015 At Facebook, lots of applications are interested in data being written to Facebook's data stores. Having each of these applications poll the data stores of interest would be untenable, so Facebook built a pub-sub system to identify updates and transmit notifications to …
RIPQ: Advanced photo caching on flash for Facebook
RIPQ: Advanced Photo Caching on Flash for Facebook - Tang et al. 2015 It's three for the price of one with this paper: we get to deepen our understanding of the characteristics of flash, examine a number of priority queue and caching algorithms, and get a glimpse into what's behind an important part of Facebook's …
f4: Facebook’s warm BLOB storage system
f4: Facebook's warm BLOB storage system - Muralidhar et al. 2014 This is a story of system engineering trade-offs, a design informed by data analysis, and hard-won experience. It's the story of how Facebook implemented a tiered storage solution for BLOBs and introduced per data class (temperature) replication factor, latency, and time-to-recovery tuning. If you're …