Delta State Replicated Data Types

April 25, 2016July 27, 2017 ~ adriancolyer

Delta State Replicated Data Types - Almeida et al. 2016 You know when you want to use CRDTs for their convergence properties, but the amount of state you're required to pass around gets out of hand? In this paper, Almeida et al. show how to retain the advantages of state-based CRDTs, but with much smaller … Continue reading Delta State Replicated Data Types

FairRide: Near-Optimal, Fair Cache Sharing

March 23, 2016July 27, 2017 ~ adriancolyer ~ 3 Comments

FairRide: Near-Optimal, Fair Cache Sharing - Pu et al. 2016 Yesterday we looked at a near-optimal packet scheduling algorithm (LSTF), today it's the turn of a near-optimal fair cache sharing algorithm. We're concerned with the scenario where a single cache resource is shared by multiple applications / users. Ideally we'd like three properties to hold: … Continue reading FairRide: Near-Optimal, Fair Cache Sharing

HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm

March 17, 2016July 27, 2017 ~ adriancolyer ~ 3 Comments

HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm - Heule et al. 2013 Continuing on the theme of approximations from yesterday, today's paper looks at what must be one of the best known approximate data structures after the Bloom Filter, HyperLogLog. It's HyperLogLog with a twist though - a … Continue reading HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm

MacroBase: Analytic Monitoring for the Internet of Things

March 16, 2016July 27, 2017 ~ adriancolyer ~ 5 Comments

MacroBase: Analytic Monitoring for the Internet of Things - Bailis et al. 2016 It looks like Peter Alvaro is not the only one to be doing some industrial collaboration recently! MacroBase is the result of Peter Bailis' collaboration with Cambridge Mobile Telematics (CMT), an IoT company. The topic at hand is analytic monitoring - detecting … Continue reading MacroBase: Analytic Monitoring for the Internet of Things

ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks

January 8, 2016July 27, 2017 ~ adriancolyer ~ 10 Comments

ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging - Mohan et al. 1992 This is part 5 of a 7 part series on (database) 'Techniques Everyone Should Know.' From Peter Bailis' introduction to this paper in chapter 3 of the Redbook: Another major problem in transaction processing is maintaining … Continue reading ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks

Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections

November 27, 2015July 27, 2017 ~ adriancolyer ~ 9 Comments

Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections - Steinforder & Vinju, 2015 You'd think that the collection classes in modern JVM-based languages would be highly efficient at this point in time - and indeed they are. But the wonderful thing is that there always seems to be room for improvement. Today's … Continue reading Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections

GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores

November 24, 2015July 27, 2017 ~ adriancolyer

GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores - Li & Cox 2013 One of the wonderful things about reading papers and being exposed to lots of different problems and their solutions is that you never know when an idea might resurface and be useful in a new context or challenge you are facing. Yesterday … Continue reading GD-Wheel: A Cost-Aware Replacement Policy for Key-Value Stores

Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility

November 23, 2015July 27, 2017 ~ adriancolyer ~ 4 Comments

Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility - Varghese & Lauck 1987 Yashiro Matsuda recently wrote a blog post describing Apache Kafka's use of Hierarchical Timing Wheels to keep track of large numbers of outstanding requests. In the Kafka use case, each request lives in a 'purgatory' … Continue reading Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility

Fail at Scale & Controlling Queue Delay

November 19, 2015July 27, 2017 ~ adriancolyer ~ 4 Comments

Controlling Queue Delay - Nichols & Van Jacobsen, 2012, and Fail at Scale - Maurer, 2015 Fail at Scale (Maurer) Ben Maurer recently wrote a great article for ACM Queue on how Facebook achieves reliability in the face of rapid change: To keep Facebook reliable in the face of rapid change we study common patterns … Continue reading Fail at Scale & Controlling Queue Delay

Read-Log-Update: A Lightweight Synchronization Mechanism for Concurrent Programming

October 27, 2015July 27, 2017 ~ adriancolyer

Read-Log-Update: A Lightweight Synchronization Mechanism for Concurrent Programming - Matveev et al. 2015 An important paradigm in concurrent data structure scalability is to support read-only traversals: sequences of reads that execute without any synchronization (and hence require no memory fences and generate no contention). The gain from such unsynchronized traversals is significant because they account … Continue reading Read-Log-Update: A Lightweight Synchronization Mechanism for Concurrent Programming