Azure accelerated networking: SmartNICs in the public cloud Firestone et al., NSDI'18 We’re still on the ‘beyond CPUs’ theme today, with a great paper from Microsoft detailing their use of FPGAs to accelerate networking in Azure. Microsoft have been doing this since 2015, and hence this paper also serves as a wonderful experience report documenting …
Tag: Microsoft
Microsoft technology and systems
The evolution of continuous experimentation in software product development
The evolution of continuous experimentation in software product development Fabijan et al., ICSE'17 (Author personal version here) If you've been following along with the A/B testing related papers this week and thinking "we should probably do more of that in my company," then today's paper choice is for you. Anchored in experiences at Microsoft, the …
Seven rules of thumb for web site experimenters
Seven rules of thumb for web site experimenters Kohavi et al., KDD'14 Following yesterday's 12 metric interpretation pitfalls, today we're looking at 7 rules of thumb for designing web site experiments. There's a little bit of duplication here, but the paper is packed with great real world examples, and there is some very useful new …
A dirty dozen: twelve common metric interpretation pitfalls in online controlled experiments
A dirty dozen: twelve common metric interpretation pitfalls in online controlled experiments Dmitriev et al., KDD 2017 Pure Gold! Here we have twelve wonderful lessons in how to avoid expensive mistakes in companies that are trying their best to be data-driven. A huge thank you to the team from Microsoft for sharing their hard-won experiences …
Azure Data Lake Store: a hyperscale distributed file service for big data analytics
Azure data lake store: a hyperscale distributed file service for big data analytics Douceur et al., SIGMOD'17 Today's paper takes us inside Microsoft Azure's distributed file service called the Azure Data Lake Store (ADLS). ADLS is the successor to an internal file system called Cosmos, and marries Cosmos semantics with HDFS, supporting both Cosmos and …
Dhalion: self-regulating stream processing in Heron
Dhalion: Self-regulating stream processing in Heron Floratou et al., VLDB 2017 Dhalion follows on nicely from yesterday's paper looking at the modular architecture of Heron, and aims to reduce the "complexity of configuring, managing, and deploying" streaming applications. In particular, streaming applications deployed as Heron topologies, although the authors are keen to point out the …
Gray failure: the Achilles’ heel of cloud-scale systems
Gray failure: the Achilles' heel of cloud-scale systems Huang et al., HotOS'17 If you're going to fail, fail properly dammit! All this limping along in degraded mode, doing your best to mask problems, turns out to be one of the key causes of major availability breakdowns and performance anomalies in cloud-scale systems. Today's HotOS'17 paper …
Usage patterns and the economics of the public cloud
Usage patterns and the economics of the public cloud Kilcioglu et al., WWW'17 Illustrating the huge diversity of topics covered at WWW, following yesterday's look at recovering mobile user trajectories from aggregate data, today's choice studies usage variation and pricing models in the public cloud. The basis for the study is data from 'a major …
Dependency-driven analytics: a compass for uncharted data oceans
Dependency-driven analytics: a compass for uncharted data oceans Mavlyutov et al., CIDR 2017 Like yesterday's paper, today's paper considers what to do when you simply have too much data to be able to process it all. Forget data lakes, we're in data ocean territory now. This is a problem Microsoft faced with their large clusters …
Achieving human parity in conversational speech recognition
Achieving Human Parity in Conversational Speech Recognition Xiong et al., Microsoft Technical Report, 2016 The headline story here is that for the first time a system has been developed that exceeds human performance in one of the most difficult of all human speech recognition tasks: natural conversations held over the telephone. This is known as …