The evolution of continuous experimentation in software product development

Fabijan et al., ICSE'17 (an author personal version is available). If you've been following along with the A/B testing related papers this week and thinking "we should probably do more of that in my company," then today's paper choice is for you. Anchored in experiences at Microsoft, the …

Peeking at A/B tests: continuous monitoring without pain

Peeking at A/B tests: why it matters, and what to do about it, Johari et al., KDD'17, and Continuous monitoring of A/B tests without pain: optional stopping in Bayesian testing, Deng, Lu, et al., CEUR'17. Today we have a double header: two papers addressing the challenge of monitoring ongoing experiments. Early stopping in traditional A/B …
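The "peeking" problem these two papers tackle is easy to demonstrate for yourself. The sketch below is my own illustration, not code from either paper (all names and parameters are made up): it runs a batch of A/A experiments in which there is no true difference, checks an ordinary fixed-horizon z-test every hundred observations, and stops at the first p < 0.05. The false positive rate climbs far above the nominal 5%, which is exactly why the papers develop always-valid p-values and Bayesian optional stopping.

```python
# Illustrative simulation of the peeking problem (not from either paper).
import math
import numpy as np

rng = np.random.default_rng(42)

def two_sample_p(a, b):
    """Two-sided p-value for a z-test on the difference in means (unit variance assumed)."""
    n = len(a)
    z = (a.mean() - b.mean()) / math.sqrt(2.0 / n)
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

def run_aa_experiment(n_max=5_000, peek_every=100):
    a = rng.normal(0.0, 1.0, n_max)   # control: no true effect
    b = rng.normal(0.0, 1.0, n_max)   # treatment: identical distribution
    # Peek repeatedly and "stop" at the first significant result.
    peeked_significant = any(
        two_sample_p(a[:n], b[:n]) < 0.05
        for n in range(peek_every, n_max + 1, peek_every)
    )
    # Compare with looking only once, at the pre-planned horizon.
    final_significant = two_sample_p(a, b) < 0.05
    return peeked_significant, final_significant

results = [run_aa_experiment() for _ in range(500)]
peek_fpr = np.mean([p for p, _ in results])
fixed_fpr = np.mean([f for _, f in results])
print(f"false positive rate with peeking:    {peek_fpr:.1%}")   # substantially above 5%
print(f"false positive rate at fixed horizon: {fixed_fpr:.1%}")  # close to 5%
```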

An efficient bandit algorithm for real-time multivariate optimization

Hill et al., KDD'17. Aka, "How Amazon improved conversion by 21% in a single week!" Yesterday we saw the hard-won wisdom on display in 'seven rules of thumb', recommending that experiments be kept simple and only test one thing at a time, otherwise interpreting the results can get really …
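To make the explore/exploit idea concrete, here is a tiny Thompson-sampling sketch over page-layout combinations. To be clear, this is not the algorithm from Hill et al. — their contribution is a parametric Bayesian model plus hill-climbing precisely so that layout combinations don't have to be enumerated as independent arms — it just shows the basic bandit loop that such multivariate optimization builds on. All slot names, alternatives, and conversion rates below are invented.

```python
# Illustrative Beta-Bernoulli Thompson sampling over layout combinations
# (not Hill et al.'s method; enumerating arms like this does not scale).
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Three widget slots, each with a few alternatives: 3 * 2 * 3 = 18 layouts ("arms").
slots = {"hero": ["A", "B", "C"], "button": ["buy", "cart"], "banner": ["x", "y", "z"]}
arms = list(itertools.product(*slots.values()))

# Hidden "true" conversion rate per layout, used only to simulate user behaviour.
true_rate = {arm: rng.uniform(0.02, 0.08) for arm in arms}

# Beta(1, 1) prior on each arm's conversion rate.
successes = {arm: 1.0 for arm in arms}
failures = {arm: 1.0 for arm in arms}

for impression in range(20_000):
    # Thompson sampling: draw a plausible rate for each arm, show the best draw.
    sampled = {arm: rng.beta(successes[arm], failures[arm]) for arm in arms}
    chosen = max(sampled, key=sampled.get)
    converted = rng.random() < true_rate[chosen]
    successes[chosen] += converted
    failures[chosen] += 1 - converted

best = max(arms, key=lambda a: true_rate[a])
picked = max(arms, key=lambda a: successes[a] / (successes[a] + failures[a]))
print("best layout by true rate:  ", best)
print("layout favoured by bandit: ", picked)
```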

Seven rules of thumb for web site experimenters

Kohavi et al., KDD'14. Following yesterday's 12 metric interpretation pitfalls, today we're looking at 7 rules of thumb for designing web site experiments. There's a little bit of duplication here, but the paper is packed with great real-world examples, and there is some very useful new …

A dirty dozen: twelve common metric interpretation pitfalls in online controlled experiments

Dmitriev et al., KDD'17. Pure gold! Here we have twelve wonderful lessons in how to avoid expensive mistakes in companies that are trying their best to be data-driven. A huge thank you to the team from Microsoft for sharing their hard-won experiences …

Distributed deep neural networks over the cloud, the edge, and end devices

Teerapittayanon et al., ICDCS'17. Earlier this year we looked at Neurosurgeon, in which the authors do a brilliant job of exploring the trade-offs when splitting a DNN such that some layers are processed on an edge device (e.g., mobile phone), and some …
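The core splitting idea is simple to sketch. In the toy example below (an illustration under my own assumptions, not the DDNN architecture or code from the paper) the early layers of a small network run "on device", only the much smaller intermediate activation is shipped over the network, and the remaining layers finish the job "in the cloud". The interesting engineering is in where to cut, and in this paper, in adding early exit points so that easy inputs can be classified locally and never leave the device at all.

```python
# Toy edge/cloud split of a small MLP (made-up sizes; not the paper's network).
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    return np.maximum(x, 0.0)

# A small MLP: 4096-dim input -> 256 -> 64 -> 10 classes.
W1, b1 = rng.normal(size=(4096, 256)), np.zeros(256)   # edge layers
W2, b2 = rng.normal(size=(256, 64)), np.zeros(64)       # cloud layers
W3, b3 = rng.normal(size=(64, 10)), np.zeros(10)

def edge_forward(x):
    """Runs on the device; its output is what gets sent over the network."""
    return relu(x @ W1 + b1)

def cloud_forward(activation):
    """Runs in the cloud on the received intermediate activation."""
    h = relu(activation @ W2 + b2)
    return h @ W3 + b3

x = rng.normal(size=4096)       # e.g. features from a phone camera frame
sent = edge_forward(x)          # 256 floats cross the network instead of 4096
print("bytes uploaded:", sent.nbytes, "vs raw input:", x.nbytes)
print("predicted class:", int(np.argmax(cloud_forward(sent))))
```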

CLKSCREW: Exposing the perils of security-oblivious energy management

Tang et al., USENIX Security '17. This is brilliant and terrifying in equal measure. CLKSCREW demonstrably takes the Trust out of ARM's TrustZone, and it wouldn't be at all surprising if it took the Secure out of SGX too (though the researchers didn't investigate that). It's the …