Omid reloaded: scalable and highly-available transaction processing

March 17, 2017 (updated July 31, 2017) ~ adriancolyer

Omid, reloaded: scalable and highly-available transaction processing - Shacham et al., FAST '17 Omid is a transaction processing service powering web-scale production systems at Yahoo that digest billions of events per day and push them into a real-time index. It's also been open-sourced and is currently incubating at Apache as the Apache Omid project. What's interesting …

Dependency-driven analytics: a compass for uncharted data oceans

January 20, 2017 (updated July 31, 2017) ~ adriancolyer ~ 4 Comments

Dependency-driven analytics: a compass for uncharted data oceans - Mavlyutov et al. CIDR 2017 Like yesterday's paper, today's paper considers what to do when you simply have too much data to be able to process it all. Forget data lakes, we're in data ocean territory now. This is a problem Microsoft faced with their large clusters …

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server

April 27, 2016 (updated July 27, 2017) ~ adriancolyer ~ 2 Comments

GeePS: Scalable deep learning on distributed GPUs with a GPU-specialized parameter server - Cui et al. 2016 (EuroSys 2016) We know that deep learning is well suited to GPUs since it has inherent parallelism. But so far this has mostly been limited to either a single GPU (e.g. using Caffe) or to specially built distributed …

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

August 21, 2015 (updated July 26, 2017) ~ adriancolyer ~ 4 Comments

MillWheel: Fault-Tolerant Stream Processing at Internet Scale - Akidau et al. (Google) 2013 Earlier this week we looked at the Google Cloud Dataflow model which is implemented on top of FlumeJava (for batch) and MillWheel (for streaming): We have implemented this model internally in FlumeJava, with MillWheel used as the underlying execution engine for streaming …

Heracles: Improving Resource Efficiency at Scale

June 16, 2015 (updated July 26, 2017) ~ adriancolyer

Heracles: Improving Resource Efficiency at Scale - Lo et al. 2015 Until recently, scaling from Moore’s law provided higher compute per dollar with every server generation, allowing datacenters to scale without raising the cost. However, with several imminent challenges in technology scaling, alternate approaches are needed. Those approaches involve increasing server utilization, which is still …

Twitter Heron: Stream Processing at Scale

June 15, 2015 (updated July 26, 2017) ~ adriancolyer ~ 14 Comments

Twitter Heron: Stream Processing at Scale - Kulkarni et al. 2015 It's hard to imagine something more damaging to Apache Storm than this. Having read it through, I'm left with the impression that the paper might as well have been titled "Why Storm Sucks", which coming from Twitter themselves is quite a statement. There's a …

Pregel: A System for Large-Scale Graph Processing

May 26, 2015 (updated July 26, 2017) ~ adriancolyer ~ 13 Comments

Pregel: A System for Large-Scale Graph Processing - Malewicz et al. (Google) 2010 "Many practical computing problems concern large graphs." Yesterday we looked at some of the models for understanding networks and graphs. Today's paper focuses on processing of graphs, especially the efficient processing of large graphs where large can mean billions of vertices and …

Wormhole: Reliable pub-sub to support Geo-Replicated Internet Services

May 14, 2015 (updated July 26, 2017) ~ adriancolyer

Wormhole: Reliable pub-sub to support Geo-Replicated Internet Services - Sharma et al. 2015 At Facebook, lots of applications are interested in data being written to Facebook's data stores. Having each of these applications poll the data stores of interest would be untenable, so Facebook built a pub-sub system to identify updates and transmit notifications to …

Musketeer – Part I: What’s the best data processing system?

April 27, 2015 (updated July 26, 2017) ~ adriancolyer ~ 18 Comments

Musketeer: all for one, one for all in data processing systems - Gog et al. 2015 For between 40% and 80% of the jobs submitted to MapReduce systems, you'd be better off just running them on a single machine... It was EuroSys 2015 last week, and a great new crop of papers was presented. Gog et al. …