SnappyData: A unified cluster for streaming, transactions, and interactive analytics

Mozafari et al., CIDR 2017. Update: fixed broken paper link, thanks Zteve. On Monday we looked at Weld, which showed how to combine disparate data processing and analytic frameworks using a common underlying IR. Yesterday we looked at Peloton, which adapts to mixed OLTP and …

Time-adaptive sketches (Ada sketches) for summarizing data streams

Shrivastava et al., SIGMOD 2016. More algorithm fun today, and again in the context of data streams. It’s the 3 V’s of big data, but not as you know it: Volume, Velocity, and Var… Volatility. Volatility here refers to changing patterns in the data over time, and …

Sharing-aware outlier analytics over high-volume data streams

Cao et al., SIGMOD 2016. With yesterday’s preliminaries on skyline queries out of the way, it’s time to turn our attention to the Sharing-aware Outlier Processing (SOP) algorithm of Cao et al. The challenge that SOP addresses is that of building a stream-based outlier detection system that can …

StreamScope: Continuous reliable distributed processing of big data streams

Lin et al., NSDI '16. An emerging trend in big data processing is to extract timely insights from continuous big data streams with distributed computation running on a large cluster of machines. Examples of such data streams include those from sensors, mobile devices, and on-line …