Columnstore and B+ tree – are hybrid physical designs important?

September 28, 2018 ~ September 23, 2018 ~ adriancolyer ~ 12 Comments

Dziedzic et al., SIGMOD'18. Earlier this week we looked at the design of column stores and their advantages for analytic workloads. What should you do, though, if you have a mixed workload including transaction processing, decision support, and operational analytics? Microsoft SQL Server supports hybrid …

The design and implementation of modern column-oriented database systems

September 26, 2018 ~ September 23, 2018 ~ adriancolyer ~ 11 Comments

Abadi et al., Foundations and Trends in Databases, 2012. I came here by following the references in the Smoke paper we looked at earlier this week. "The design and implementation of modern column-oriented database systems" is a longer piece at 87 pages, but it’s good value-for-time. …

Smoke: fine-grained lineage at interactive speed

September 24, 2018 ~ September 23, 2018 ~ adriancolyer ~ 5 Comments

Psallidas et al., VLDB'18. Data lineage connects the input and output data items of a computation. Given a set of output records, a backward lineage query selects a subset of the output records and asks "which input records contributed to these results?" A forward lineage query selects a subset …

Same-different problems strain convolutional neural networks

September 21, 2018 ~ September 13, 2018 ~ adriancolyer ~ 1 Comment

Ricci et al., arXiv 2018. Since we’ve been looking at the idea of adding structured representations and relational reasoning to deep learning systems, I thought it would be interesting to finish off the week with an example of a problem that seems to require it: detecting whether objects in …

Relational inductive biases, deep learning, and graph networks

September 19, 2018 ~ September 13, 2018 ~ adriancolyer ~ 9 Comments

Battaglia et al., arXiv'18. Earlier this week we saw the argument that causal reasoning (where most of the interesting questions lie!) requires more than just associational machine learning. Structural causal models have at their core a graph of entities and relationships between them. Today we’ll be looking …

The seven tools of causal inference with reflections on machine learning

September 17, 2018 ~ September 13, 2018 ~ adriancolyer ~ 8 Comments

Pearl, CACM 2018. With thanks to @osmandros for sending me a link to this paper on Twitter. In this technical report Judea Pearl reflects on some of the limitations of machine learning systems that are based solely on statistical interpretation of data. To understand …

An empirical analysis of anonymity in Zcash

September 14, 2018 ~ September 7, 2018 ~ adriancolyer

Kappos et al., USENIX Security'18. As we’ve seen before, in practice Bitcoin offers little in the way of anonymity. Zcash, on the other hand, was carefully designed with privacy in mind. It offers strong theoretical guarantees concerning privacy. So in theory users of Zcash can remain anonymous. In …

QSYM: a practical concolic execution engine tailored for hybrid fuzzing

September 12, 2018 ~ September 7, 2018 ~ adriancolyer

Yun et al., USENIX Security 2018. There are two main approaches to automated test case generation for uncovering bugs and vulnerabilities: fuzzing and concolic execution. Fuzzing is good at quickly exploring the input space, but can get stuck when trying to get past more complex …

NAVEX: Precise and scalable exploit generation for dynamic web applications

September 10, 2018 ~ September 7, 2018 ~ adriancolyer ~ 10 Comments

Alhuzali et al., USENIX Security 2018. NAVEX (https://github.com/aalhuz/navex) is a very powerful tool for finding executable exploits in dynamic web applications. It combines static and dynamic analysis (to cope with dynamically generated web content) to find vulnerable points in web applications, determine whether inputs to …

Unveiling and quantifying Facebook exploitation of sensitive personal data for advertising purposes

September 7, 2018 ~ September 6, 2018 ~ adriancolyer ~ 3 Comments

Cabañas et al., USENIX Security 2018. Earlier this week we saw how the determined can still bypass most browser and tracker-blocking extension protections to track users around the web. Today’s paper is a great example of why you should care about that. Cabañas …