Capturing and enhancing in situ system observability for failure detection

Huang et al., OSDI'18 The central idea in this paper is simple and brilliant. The place where we have the most relevant information about the health of a process or thread is in the clients that call it. Today the state of the practice is …

Automatic discovery of tactics in spatio-temporal soccer match data

Decroos et al., KDD'18 Here’s a fun paper to end the week. Data collection from sporting events is now widespread. This fuels an endless thirst for team and player statistics. In terms of football (which shall refer to the game of soccer throughout this write-up) that …

Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding

Hundman et al., KDD'18 How do you effectively monitor a spacecraft? That was the question facing NASA’s Jet Propulsion Laboratory as they looked forward towards exponentially increasing telemetry data rates for Earth Science satellites (e.g., around 85 terabytes/day for a Synthetic Aperture Radar satellite). Spacecraft are …

Online parameter selection for web-based ranking problems

Agarwal et al., KDD'18 Last week we looked at production systems from Facebook, Airbnb, and Snap Inc.; today it’s the turn of LinkedIn. This paper describes the system and model that LinkedIn use to determine the items to be shown in a user’s feed: It replaces previous hand-tuning …

I know you’ll be back: interpretable new user clustering and churn prediction on a mobile social application

Yang et al., KDD'18 Churn rates (how fast users abandon your app / service) are really important in modelling a business. If the churn rate is too high, it’s hard to maintain growth. Since acquiring new customers is …

Customized regression model for Airbnb dynamic pricing

Ye et al., KDD'18 This paper details the methods that Airbnb use to suggest prices to listing hosts (hosts ultimately remain in control of pricing on the Airbnb platform). The proposed strategy model has been deployed in production for more than 1 year at Airbnb. The launch of …

Rosetta: large scale system for text detection and recognition in images

Borisyuk et al., KDD'18 Rosetta is Facebook’s production system for extracting text (OCR) from uploaded images. In the last several years, the volume of photos being uploaded to social media platforms has grown exponentially to the order of hundreds of millions every day, presenting …