The Log-Structured Merge-Tree (LSM Tree) - O'Neil et al. '96. Log-Structured Merge is an important technique used in many modern data stores (for example, BigTable, Cassandra, HBase, Riak, ...). Suppose you have a hierarchy of storage options for data - for example, RAM, SSDs, Spinning disks, with different price/performance characteristics. Furthermore, you have a large … Continue reading The Log-Structured Merge-Tree (LSM Tree)
Tag: Algorithms and data structures
Handy algorithms and data structures – e.g., for approximations
The Boom Hierarchy
The Boom Hierarchy - Bunkenberg In honour of the CodeMesh conference this week, and as recommended in a recent tweet by Eric Meijer, today's paper gives insight on some of our most fundamental data structures and the operations on them. The Boom Hierarchy is the family of data structures tree, list, bag, and set, to … Continue reading The Boom Hierarchy
Route planning in transportation networks
Route planning in transportation networks - Bast et al. Yesterday we looked at the CS problems behind startups using mobile workforces to build variations on the pick-up and delivery problem. Today I thought it would be fun to look at a related problem - journey planning. What's the tech behind city navigation and journey planning … Continue reading Route planning in transportation networks
Dynamic vehicle routing, pickup, and delivery problems
A double-header today: A review of dynamic vehicle routing problems - Pillac et al. 2012, and Dynamic pickup and delivery problems - Berbeglia et al. 2010 With the ubiquity of location-enabled smartphones we're increasingly seeing new startup businesses that take advantage of a mobile workforce to pick up and deliver goods (e.g. groceries, meals, parcels) … Continue reading Dynamic vehicle routing, pickup, and delivery problems
An unsupervised algorithm for person-name disambiguation on the web
An unsupervised algorithm for person-name disambiguation on the web - Delgado et al. 2014 Many people share the same name. When you search for a person by name on the web, the results you get back are page ranked without consideration to the individual they refer to. If you're searching for a person who shares … Continue reading An unsupervised algorithm for person-name disambiguation on the web
Analysis of join-the-shortest-queue routing
Analysis of join-the-shortest queue routing for web server farms - Gupter et al 2007 What's the best way to balance web requests across a set of servers? Round-robin is the simple algorithm that everyone knows best, but there is a better way... This paper analyzes the Join the Shortest Queue (JSQ) routing policy and shows … Continue reading Analysis of join-the-shortest-queue routing
Patience is a virtue
Patience is a virtue: revisiting merge and sort on modern processors - Chandramouli and Goldstein 2014 This is a wonderful story. The seeds of an algorithm laid down over 50 years ago, rediscovered, brought up to date, and found to be highly relevant. Our investigation has resulted in some surprising discoveries about a mostly ignored … Continue reading Patience is a virtue
Outperforming LRU with an Adaptive Replacement Cache Algorithm
Outperforming LRU with an Adaptive Replacement Cache Algorithm Megiddo & Modha, 2004 Ask most people to name a cache management algorithm, and the first thing that springs to their mind is likely to be LRU. But it turns out there is a better (lesser-known) algorithm called ARC (Adaptive Replacement Cache), as used for example by … Continue reading Outperforming LRU with an Adaptive Replacement Cache Algorithm