The secret-sharer: evaluating and testing unintended memorization in neural networks

The secret sharer: evaluating and testing unintended memorization in neural networks Carlini et al., USENIX Security Symposium 2019 This is a really important paper for anyone working with language or generative models, and just in general for anyone interested in understanding some of the broader implications and possible unintended consequences of deep learning. There’s also … Continue reading The secret-sharer: evaluating and testing unintended memorization in neural networks

Even more amazing papers at VLDB 2019 (that I didn’t have space to cover yet)

We’ve been covering papers from VLDB 2019 for the last three weeks, and next week it will be time to mix things up again. There were so many interesting papers at the conference this year though that I haven’t been able to cover nearly as many as I would like. So today’s post is a … Continue reading Even more amazing papers at VLDB 2019 (that I didn’t have space to cover yet)

Updating graph databases with Cypher

Updating graph databases with Cypher Green et al., VLDB'19 This is the story of a great collaboration between academia, industry, and users of the Cypher graph querying language as created by Neo4j. Beyond Neo4j, Cypher is also supported in SAP HANA Graph, RedisGraph, Agnes Graph, and Memgraph. Cypher for Apache Spark, and Cypher over Gremlin … Continue reading Updating graph databases with Cypher

Fine-grained, secure and efficient data provenance on blockchain systems

Fine-grained, secure and efficient data provenance on blockchain systems Ruan et al., VLDB'19 We haven’t covered a blockchain paper on The Morning Paper for a while, and today’s choice won the best paper award at VLDB’19. The goal here is to enable smart contracts to be written in which the contract logic depends on the … Continue reading Fine-grained, secure and efficient data provenance on blockchain systems

Declarative recursive computation on an RDBMS

Declarative recursive computation on an RDBMS... or, why you should use a database for distributed machine learing Jankov et al., VLDB'19 If you think about a system like Procella that’s combining transactional and analytic workloads on top of a cloud-native architecture, extensions to SQL for streaming, dataflow based materialized views (see e.g. Naiad, Noria, Multiverses, … Continue reading Declarative recursive computation on an RDBMS

Procella: unifying serving and analytical data at YouTube

Procella: unifying serving and analytical data at YouTube Chattopadhyay et al., VLDB'19 Academic papers aren’t usually set to music, but if they were the chorus of Queen’s "I want it all (and I want it now...)" seems appropriate here. Anchored in the primary use case of supporting Google’s YouTube business, what we’re looking at here … Continue reading Procella: unifying serving and analytical data at YouTube

Experiences with approximating queries in Microsoft’s production big-data clusters

Experiences with approximating queries in Microsoft’s production big-data clusters Kandula et al., VLDB'19 I’ve been excited about the potential for approximate query processing in analytic clusters for some time, and this paper describes its use at scale in production. Microsoft’s big data clusters have 10s of thousands of machines, and are used by thousands of … Continue reading Experiences with approximating queries in Microsoft’s production big-data clusters