Omid reloaded: scalable and highly-available transaction processing

Omid, reloaded: scalable and highly-available transaction processing, Shacham et al., FAST '17. Omid is a transaction processing service powering web-scale production systems at Yahoo that digest billions of events per day and push them into a real-time index. It has also been open-sourced and is currently incubating at Apache as the Apache Omid project. What's interesting ...

Chronix: Long term storage and retrieval technology for anomaly detection in operational data

Chronix: Long term storage and retrieval technology for anomaly detection in operational data, Lautenschlager et al., FAST 2017. Chronix (http://www.chronix.io/) is a time-series database optimised to support anomaly detection. It supports a multi-dimensional generic time series data model and has built-in high-level functions for time series operations. Chronix also has a scheme called "Date-Delta-Compaction" (DDC) ...

Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions

Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions, Ganesan et al., FAST 2017. It's a tough life being the developer of a distributed datastore. Thanks to the wonderful work of Kyle Kingsbury (aka @aphyr) and his efforts on Jepsen.io, awareness of data loss and related issues in ...