We’ve reached the end of term again, and The Morning Paper will be taking a two-week break to recharge my batteries and my paper backlog! We covered a lot of ground over the last few months, and I’ve selected a few highlighted papers/posts at the end of this piece to tide you over until Monday 24th April when The Morning Paper will resume normal service.
I’d like to take this opportunity to thank you all once more for reading! The Morning Paper flies in the face of fashion – I write long-form pieces, and although I try to explain the material as simply as I can, the subject matter invariably makes for dense reading at times. The blog is hosted on a WordPress site using a very basic theme (all the cool kids are on Medium I hear), and the primary distribution mechanism is an email list (how 90’s!). Despite all that, The Morning Paper mailing list passed the 10,000 subscriber mark this last quarter – it’s wonderful to know that there are so many people out there interested in this kind of material. I’d also like to thank all of the researchers whose work I get to cover – you make researching and writing The Morning Paper a joy.
While we’re on the subject of thank yous, I’d also like to say thank you to the team at the ACM who recently worked on a mechanism to provide open access to any paper from the ACM Digital Library that is covered on The Morning Paper.
I always try to select papers that are open access, which often means scrabbling around to find a version an author has posted on their personal site. As well as opening up new potential content for the blog (for example, the ACM Computing Surveys), being able to link to ACM DL content should provide more stable links over time. If you see a link to an ACM DL piece in the blog and you’re not an ACM DL subscriber, please don’t be put off – you should be able to click through and download the PDF. If you have any difficulties, just let me know and I’ll look into it for you.
One last thing before we get to the selections: there are now over 550 paper write-ups on this blog! If you’ve joined recently, that means there is a ton of great research you may have missed out on. Currently the only real way to explore that backlog is browsing through the archives by month. During this Easter break, I’m going to try and get my act together with a tagging scheme so that you can more easily find papers of interest from the backlog.
In TMP publication order, here are a few edited highlights from the first three months of 2017:
- Incremental consistency guarantees for replicated objects – the ‘Correctable’ interface that enables speculative computation on possibly inconsistent results.
- Weld: a common runtime for high performance analytics – up to a 31x speed-up when using multiple data processing frameworks in concert.
- Toward sustainable insights, or why polygamy is bad for you – understanding the dangers of false correlations in data exploration.
- Quantifying controversy in social media (and the follow-up, Reducing controversy by connecting opposing views) – alerting you when there might be another side to the story that you’re not currently seeing.
- Beyond the words: predicting user personality from heterogeneous information – building a model of your personality from your social media interactions.
- Making smart contracts smarter – smart contracts in blockchains and some of the perils involved in making them robust.
- When DNNs go wrong – adversarial examples and what we can learn from them – reminding us that DNNs still don’t ‘think’ quite the same way that we do.
- HopFS: Scaling hierarchical file system metadata using NewSQL databases – how Spotify improved Hadoop cluster throughput by an order of magnitude.
- Thou shalt not depend on me: analysing the use of outdated JavaScript libraries on the web – vulnerable JavaScript libraries are everywhere, and they’re not getting updated…
- Redundancy does not imply fault tolerance: analysis of distributed storage reactions to single errors and corruptions – the distributed datastore developer’s nightmare continues…
- Enlightening the I/O path: a holistic approach to application performance – thinking about file system request priorities end-to-end leads to up to 53% higher request throughput and 42x better 99%-ile request latency.
- Application crash consistency and performance with CCFS – stronger consistency and better performance, the file system you’ve been waiting for.
- The curious case of the PDF converter that likes Mozart – rethinking privacy dialogs to great effect.
(Yes, OK, I had a bit of trouble choosing this time around. That was rather a long list, and it was difficult even getting it down to just those picks!)
Also, don’t forget we started working through the top 100 awesome deep learning papers list, and you can find the first week of posts from that starting here, and the second week here.
See you in a couple of weeks, Adrian.