Self-driving database management systems

January 17, 2017 (updated July 31, 2017) ~ adriancolyer ~ 3 Comments

Self-driving database management systems Pavlo et al., CIDR 2017 We've previously seen many papers looking into how distributed and database systems technologies can support machine learning workloads. Today's paper choice explores what happens when you do it the other way round - i.e., embed machine learning into a DBMS in order to continuously optimise its …

Weld: A common runtime for high performance data analytics

January 16, 2017 (updated July 31, 2017) ~ adriancolyer ~ 8 Comments

Weld: A common runtime for high performance data analytics Palkar et al. CIDR 2017 This is the first in a series of posts looking at papers from CIDR 2017. See yesterday's post for my conference overview. We have a proliferation of data and analytics libraries and frameworks - for example, Spark, TensorFlow, MXNet, NumPy, Pandas, …

Innovation, experience-based insight and vision at CIDR ’17

January 15, 2017 ~ adriancolyer ~ 2 Comments

Last week was CIDR 2017, the biennial Conference on Innovative Data Systems Research. CIDR encourages authors to take a whole system perspective and especially values "innovation, experience-based insight, and vision." That's a very good match with the attributes of papers I like to cover on The Morning Paper. So what innovation, insight, and vision does …

Incremental consistency guarantees for replicated objects

January 13, 2017 (updated July 31, 2017) ~ adriancolyer ~ 3 Comments

Incremental consistency guarantees for replicated objects Guerraoui et al., OSDI 2016 We know that there's a price to be paid for strong consistency in terms of higher latencies and reduced throughput. We also know that there's a price to be paid for weaker consistency in terms of application correctness and / or programmer difficulty. Furthermore, …

The many faces of consistency

January 12, 2017 (updated July 31, 2017) ~ adriancolyer ~ 7 Comments

The many faces of consistency Aguilera & Terry, IEEE TC on Data Engineering Bulletin, 2016 Update: Marko Vukolić posted a comment to point me to an ACM Survey paper he published together with Paolo Viotti last year that looks at 50 different consistency models for distributed non-transactional storage systems and puts them into a comprehensive …

Adaptive logging: optimizing logging and recovery costs in distributed in-memory databases

January 11, 2017 (updated July 31, 2017) ~ adriancolyer ~ 1 Comment

Adaptive Logging: Optimizing logging and recovery costs in distributed In-memory databases Yao et al., SIGMOD 2016 This is a paper about the trade-offs between transaction throughput and database recovery time. Intuitively, for example, you can do a little more work on each transaction (lowering throughput) in order to reduce the time it takes to recover …

Shasta: Interactive reporting at scale

January 10, 2017 (updated July 31, 2017) ~ adriancolyer ~ 2 Comments

Shasta: Interactive Reporting At Scale Manoharan et al., SIGMOD 2016 You have vast database schemas with hundreds of tables, applications that need to combine OLTP and OLAP functionality, queries that may join 50 or more tables across disparate data sources, oh, and the user is waiting, so you'd better deliver the results online with low …

Apache Hadoop YARN: Yet another resource negotiator

January 9, 2017 (updated July 31, 2017) ~ adriancolyer ~ 3 Comments

Apache Hadoop YARN: Yet Another Resource Negotiator Vavilapalli et al., SoCC 2013 The opening section of Prof. Demirbas' reading list is concerned with programming the datacenter, aka 'the Datacenter Operating System' - though I can't help but think of Mesosphere when I hear that latter phrase. There are four papers: in publication order these are …

“A Distributed Systems Seminar Reading List,” Spring 2017 edition

January 8, 2017 (updated July 31, 2017) ~ adriancolyer ~ 8 Comments

Update: links giving 404s were too confusing, so I've removed links to not-yet-published posts and will add them back in at the end of the week! Last year we looked at Murat Demirbas' Distributed systems seminar reading list for Spring 2016. Now of course it's 2017 and Prof. Demirbas has a new list of papers …

Strategic attentive writer for learning macro-actions

January 6, 2017 (updated July 31, 2017) ~ adriancolyer ~ 5 Comments

Strategic attentive writer for learning macro-actions Vezhnevets et al. (Google DeepMind), NIPS 2016 Baldrick may have a cunning plan, but most Deep Q Networks (DQNs) just react to what's immediately in front of them and what has come before. That is, at any given time step they propose the best action to take there and …