HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm

HyperLogLog in Practice: Algorithmic Engineering of a State of the Art Cardinality Estimation Algorithm - Heule et al. 2013 Continuing on the theme of approximations from yesterday, today's paper looks at what must be one of the best known approximate data structures after the Bloom Filter, HyperLogLog. It's HyperLogLog with a twist though - a ... Continue Reading

ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks

ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging - Mohan et al. 1992 This is part 5 of a 7 part series on (database) 'Techniques Everyone Should Know.' From Peter Bailis' introduction to this paper in chapter 3 of the Redbook: Another major problem in transaction processing is maintaining ... Continue Reading

Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility

Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility - Varghese & Lauck 1987 Yashiro Matsuda recently wrote a blog post describing Apache Kafka's use of Hierarchical Timing Wheels to keep track of large numbers of outstanding requests. In the Kafka use case, each request lives in a 'purgatory' ... Continue Reading

Read-Log-Update: A Lightweight Synchronization Mechanism for Concurrent Programming

Read-Log-Update: A Lightweight Synchronization Mechanism for Concurrent Programming - Matveev et al. 2015 An important paradigm in concurrent data structure scalability is to support read-only traversals: sequences of reads that execute without any synchronization (and hence require no memory fences and generate no contention). The gain from such unsynchronized traversals is significant because they account ... Continue Reading