PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs - Gonzalez et al. 2012

A lot of the time, we want to perform computations on graphs that model the real world. As we saw in Exploring Complex Networks, such graphs often follow a power-law degree distribution (i.e., a few nodes are very highly connected, and many nodes …

# Month: May 2015

# Distributed GraphLab: A framework for machine learning and data mining in the cloud

Distributed GraphLab: A framework for machine learning and data mining in the cloud - Low et al. 2012

Two years on from the initial GraphLab paper we looked at yesterday comes this extension to support distributed graph processing for larger graphs, including data mining use cases. In this paper, we extend the GraphLab framework to …

# GraphLab: A new framework for parallel machine learning

GraphLab: A new framework for parallel machine learning - Low et al. 2010

In this paper we propose GraphLab, a new parallel framework for ML which exploits the sparse structure and common computational patterns of ML algorithms. GraphLab enables ML experts to easily design and implement efficient scalable parallel algorithms by composing problem specific computation, …

# Pregel: A System for Large-Scale Graph Processing

Pregel: A System for Large-Scale Graph Processing - Malewicz et al. (Google) 2010

"Many practical computing problems concern large graphs." Yesterday we looked at some of the models for understanding networks and graphs. Today's paper focuses on processing of graphs, especially the efficient processing of large graphs where large can mean billions of vertices and …

# Exploring Complex Networks

Exploring Complex Networks - Strogatz 2001

Network anatomy is important to characterize because structure always affects function... Written in 2001, this article - recently recommended by Werner Vogels in his 'Back-to-Basics' series - explores the topic of complex networks. It turns out that the behaviour of individual nodes, and the way that we connect them …

# FAWN: A Fast Array of Wimpy Nodes

FAWN: A Fast Array of Wimpy Nodes - Andersen et al. 2009

A few days ago we looked at FaRM (Fast Remote Memory), which used RDMA to match network speed with the speed of CPUs and got some very impressive results in terms of queries & transactions per second. But maybe there's another way of …

# Congestion Avoidance and Control

Congestion Avoidance and Control - Jacobson & Karels, 1988 (**corrected spelling of Jacobs_o_n**)

It's October 1986 and there's trouble on the internet. A congestion collapse has reduced the bandwidth between LBL and UC Berkeley by a factor of a thousand. These two sites happened to be 400 yards apart. And that drop in …