Dynamics on emerging spaces: modeling the emergence of novelties

Dynamics on expanding spaces: modeling the emergence of novelties Loreto et al., ArXiv 2017 Something a little bit left field today to close out the week. I was drawn into this paper by an MIT Technology Review article entitled "Mathematical model reveals the patterns of how innovations arise." Who wouldn't want to read about that!? … Continue reading Dynamics on emerging spaces: modeling the emergence of novelties

Algorithmic improvements for fast concurrent cuckoo hashing

Algorithmic improvements for fast concurrent cuckoo hashing Li et al, EuroSys 2014 Today’s paper continues the work on optimistic cuckoo hashing that we looked at yesterday, extending it to support multiple writers and even higher throughput. One of the original goals for the research was to take advantage of the hardware transactional memory support in … Continue reading Algorithmic improvements for fast concurrent cuckoo hashing

MemC3: Compact and concurrent Memcache with dumber caching and smarter hashing

MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing Fan et al. NSDI 2013 At the core of this paper is an improved hashing algorithm called optimistic cuckoo hashing, and a CLOCK-based eviction algorithm that works in tandem with it. They are evaluated in the context of Memcached, where combined they give up … Continue reading MemC3: Compact and concurrent Memcache with dumber caching and smarter hashing

Time-adaptive sketches (Ada sketches) for summarizing data streams

Time-adaptive sketches (Ada Sketches) for Summarizing Data Streams Shrivastava et al. SIGMOD 2016 More algorithm fun today, and again in the context of data streams. It’s the 3 V’s of big data, but not as you know it: Volume, Velocity, and Var… Volatility. Volatility here refers to changing patterns in the data over time, and … Continue reading Time-adaptive sketches (Ada sketches) for summarizing data streams

Sharing-aware outlier analytics over high-volume data streams

Sharing-aware outlier analytics over high-volume data streams Cao et al. SIGMOD 2016 With yesterday’s preliminaries on skyline queries out of the way, it’s time to turn our attention to the Sharing-aware Outlier Processing (SOP) algorithm of Cao et al. The challenge that SOP addresses is that of building a stream-based outlier detection system that can … Continue reading Sharing-aware outlier analytics over high-volume data streams

Progressive skyline computation in database systems

Progressive skyline computation in database systems Papadias et al. SIGMOD 2003 I’m still working through some of the papers from SIGMOD 2016 (as some of you spotted, that was the unifying them for last week). But today I’m jumping back to 2003 to provide some context for a streaming analytics paper we’ll be looking at … Continue reading Progressive skyline computation in database systems

Spheres of influence for more effective viral marketing

Spheres of influence for more effective viral marketing Mehmood et al. SIGMOD ’16 In viral marketing the idea is to spread awareness of a brand or campaign by exploiting pre-existing social networks. The received wisdom is that by targeting a few influential individuals, they will be able to spread your marketing message to a large … Continue reading Spheres of influence for more effective viral marketing

Transactional data structure libraries

Transactional Data Structure Libraries Spiegelman et al. PLDI 2016 Today’s choice won a distinguished paper award at the recent PLDI 2016 conference. Spiegelman et al. show how to add transactional support to in-memory concurrent data structure libraries in a way that doesn’t sacrifice performance. Since the advent of the multi-core revolution, many efforts have been … Continue reading Transactional data structure libraries

SocialHash: An assignment framework for optimizing distributed systems operations on social networks

SocialHash: An assignment framework for optimizing distributed systems operations on social networks - Shalita et al., NSDI '16 Large scale systems frequently need to partition resources or load across multiple nodes. How you do that can make a big difference. A common approach is to use a random distribution (e.g. via consistent hashing), which usually … Continue reading SocialHash: An assignment framework for optimizing distributed systems operations on social networks