Probabilistically Bounded Staleness for Practical Partial Quorums

Probabilistically Bounded Staleness for Practical Partial Quorums - Bailis et al. 2012, and Quantifying Eventual Consistency with PBS - Bailis et al. 2014. 'Probabilistically Bounded Staleness...' was the original VLDB '12 paper, and then the authors were invited to submit an extended version to the VLDB Journal ('Quantifying Eventual Consistency...'), which was published in …

Mining and Summarizing Customer Reviews

Mining and Summarizing Customer Reviews - Hu and Liu 2004. This is the third of the three 'test-of-time' award winners from KDD'15. From the awards page: The paper introduces the problem of summarizing customer reviews and decomposes the problem into the three steps of (1) mining product features (aspects), (2) identifying opinion sentences and their …

Optimizing Search Engines using Clickthrough Data

Optimizing Search Engines using Clickthrough Data - Joachims, 2002. Today's choice is another KDD 'test-of-time' winner. The paper introduced the problem of ranking documents w.r.t. a query using not explicit user feedback but implicit user feedback in the form of clickthrough data. The author presented the Ranking SVM Algorithm to solve the proposed ranking problem. …

Efficient Algorithms for Public-Private Social Networks

Efficient Algorithms for Public-Private Social Networks - Chierichetti et al. 2015. Today's choice won a best paper award at KDD'15. The authors examine a number of algorithms for computing graph (network) measures in the context of social networks that enable private groups and connections. These are characterised by a large public graph G=(V,E), and for …

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes

A Machine Learning Framework to Identify Students at Risk of Adverse Academic Outcomes - Lakkaraju et al. 2015. This is the first of a series of papers from the Knowledge Discovery and Data Mining (KDD'15) conference that we'll look at this week. Today's paper is all about helping high school students in the US who …

MillWheel: Fault-Tolerant Stream Processing at Internet Scale

MillWheel: Fault-Tolerant Stream Processing at Internet Scale - Akidau et al. (Google) 2013. Earlier this week we looked at the Google Cloud Dataflow model, which is implemented on top of FlumeJava (for batch) and MillWheel (for streaming): We have implemented this model internally in FlumeJava, with MillWheel used as the underlying execution engine for streaming …