Coz: Finding code that counts with causal profiling - Curtsinger & Berger 2015. Update: fixed typo in paper title. Sticking to the theme of 'understanding what our systems are doing,' but focusing on a single process, Coz is a causal profiler. In essence, it makes the output of a profiler much more useful to you …
Tag: Performance
Measuring, troubleshooting and improving performance.
The Mystery Machine: End-to-end performance analysis of large-scale internet services
The Mystery Machine: End-to-end performance analysis of large-scale internet services - Chow et al. 2014 Google's Dapper paper is very well known, but Facebook's Mystery Machine seems to be much less well known - and that's a shame because I have a hunch the approach could be very relevant to many people. Current debugging and …
PerfBlower: Quickly Detecting Memory-Related Performance Problems via Amplification
PerfBlower: Quickly Detecting Memory-Related Performance Problems via Amplification - Fang et al. 2015 Another ECOOP '15 paper, and definitely something with immediate pragmatic utility. PerfBlower finds heap-related performance problems during regular test runs (not exhaustive performance tests) by amplifying the effects of small issues to make them visible. The user provides details of classes of …
Optimization Coaching for JavaScript
Optimization Coaching for JavaScript - St-Amour & Guo, 2015 Because modern programming languages heavily rely on compiler optimizations for performance, failure to apply certain key optimizations is often the source of performance issues. To diagnose these performance issues, programmers need insight about what happens during the optimization process. Consider the following program snippet from the …
Queues don’t matter when you can JUMP them
Queues don't matter when you can JUMP them - Grosvenor et al. 2015 The Cambridge Systems at Scale team are on a roll. Hot on the heels of the excellent Musketeer paper from EuroSys 2015 comes this paper on QJUMP, which last week won a best paper award at NSDI'15. Distributed systems design involves trade-offs. …
Making Sense of Performance in Data Analytics Frameworks
Making Sense of Performance in Data Analytics Frameworks - Ousterhout et al. 2015 We all know the causes of poor performance in big data analytics workloads: network I/O, disk I/O, and straggler tasks. Ousterhout et al. set out to try and quantify this, and found that what we think we know isn't necessarily so. Yet …
Detecting Discontinuities in Large-Scale Systems
Detecting Discontinuities in Large-Scale Systems - Malik et al. 2014 The 7th IEEE/ACM International Conference on Utility and Cloud Computing is coming to London in a couple of weeks' time. Many of the papers don't seem to be online yet, but here's one that is. Malik et al. tackle the problem of long-term forecasting for …
Analysis of join-the-shortest-queue routing
Analysis of join-the-shortest-queue routing for web server farms - Gupta et al. 2007 What's the best way to balance web requests across a set of servers? Round-robin is the simple algorithm that everyone knows best, but there is a better way... This paper analyzes the Join the Shortest Queue (JSQ) routing policy and shows …