WANalytics: Analytics for a geo-distributed, data intensive world

WANalytics: analytics for a geo-distributed data intensive world - Vulimiri et al. 2015 ...data is born distributed; we only control data replication and distributed execution strategies. This is true for so many sources of data. Combine this with Dave McCrory's observation that 'Data has Gravity' (i.e. it attracts applications and other data processing workloads to ...

The Log-Structured Merge-Tree (LSM Tree)

The Log-Structured Merge-Tree (LSM Tree) - O'Neil et al. '96. Log-Structured Merge is an important technique used in many modern data stores (for example, BigTable, Cassandra, HBase, Riak, ...). Suppose you have a hierarchy of storage options for data, for example RAM, SSDs, and spinning disks, with different price/performance characteristics. Furthermore, you have a large ...