Three key checklists and remedies for trustworthy analysis of online controlled experiments at scale

Fabijan et al., ICSE 2019. Last time out we looked at machine learning at Microsoft, where we learned among other things that using an online controlled experiment (OCE) approach to rolling out changes to ML-centric software is important. Prior to that …

Automating chaos experiments in production

Basiri et al., ICSE 2019. Are you ready to take your system assurance programme to the next level? This is a fascinating paper from members of Netflix’s Resilience Engineering team describing their chaos engineering initiatives: automated controlled experiments designed to verify hypotheses about how the system should behave under gray …

Teaching rigorous distributed systems with efficient model checking

Michael et al., EuroSys'19. On the surface you might think today’s paper selection an odd pick. It describes the labs environment, DSLabs, developed at the University of Washington to accompany a course in distributed systems. During the ten week course, students implement four different assignments: an exactly-once …

Fixed it for you: protocol repair using lineage graphs

Oldenburg et al., CIDR'19. This is a cool paper on a number of levels. Firstly, the main result that catches my eye is that it’s possible to build a distributed systems ‘debugger’ that can suggest protocol-level fixes. E.g. say you have a system that sometimes sends …

BLeak: automatically debugging memory leaks in web applications

Vilk & Berger, PLDI'18. BLeak is a Browser Leak debugger that finds memory leaks in web applications. You can use BLeak to test your own applications by following the instructions at http://bleak-detector.org. Guided by BLeak, we identify and fix over 50 memory leaks in popular libraries and …

Debugging data flows in reactive programs

Banken et al., ICSE'18. To round off our look at papers from ICSE, here’s a really interesting look at the challenges of debugging reactive applications (with a certain Erik Meijer credited among the authors). ... in recent years the use of Reactive Programming (RP) has exploded. Languages such as …

How not to structure your database-backed web applications: a study of performance bugs in the wild

Yang et al., ICSE'18. This is a fascinating study of the problems people get into when using ORMs to handle persistence concerns in their web applications. The authors study real-world applications and distil a catalogue of common performance anti-patterns. …