Automatic database management system tuning through large-scale machine learning Van Aken et al., SIGMOD'17 Achieving good performance in DBMSs is non-trivial as they are complex systems with many tunable options that control nearly all aspects of their runtime operation. OtterTune uses machine learning informed by data gathered from previous tuning sessions to tune new DBMS …
Year: 2017
Enabling signal processing over data streams
Enabling signal processing over data streams Nikolic et al., SIGMOD'17 If you're processing data coming from networks of sensors and devices, then it's not uncommon to use a mix of relational and signal processing operations. Data analysts use relational operators, for example, to group signals by different data sources or join signals with historical …
Complete event trend detection in high-rate data streams
Complete Event Trend detection in high-rate event streams Poppe et al., SIGMOD'17 Today's paper choice looks at the tricky problem of detecting Complete Event Trends (CET) in high-rate event streams. CET detection is useful in fraud detection, health care analytics, stock trend analytics and other similar scenarios looking for complex patterns in event streams. Detecting …
A general purpose counting filter: making every bit count
A general purpose counting filter: making every bit count Pandey et al., SIGMOD'17 It's been a while since we looked at a full-on algorithms and data structures paper, but this one was certainly worth waiting for. We're in the world of Approximate Membership Query (AMQ) data structures, of which probably the best known example …
ACIDRain: concurrency-related attacks on database-backed web applications
ACIDRain: Concurrency-related attacks on database-backed web applications Warszawski & Bailis, SIGMOD'17 Welcome back to a new term of The Morning Paper. To kick things off, we have 'ACID Rain' - a terrific paper from SIGMOD'17 that pulls together a number of threads we've studied previously: transaction processing, anomalies, and security. What ACIDRain demonstrates is that …
End of term, and Orders of Magnitude
It's end of term time again. As part of making The Morning Paper habit sustainable, I take a few weeks off three times a year to do some more relaxed background reading, recharge my paper queues, and let my mind wander. The Morning Paper will return on Monday 7th August. Here are a few selections …
Do we need specialized graph databases? Benchmarking real-time social networking applications
Do we need specialized graph databases? Benchmarking real-time social networking applications Pacaci et al., GRADES'17 Today's paper comes from the GRADES workshop co-located with SIGMOD. The authors take an established graph data management system benchmark suite (LDBC) and run it across a variety of graph and relational stores. The findings make for very interesting reading, …
Using word embedding to enable semantic queries on relational databases
Using word embedding to enable semantic queries in relational databases Bordawekar and Shmueli, DEEM'17 As I'm sure some of you have figured out, I've started to work through a collection of papers from SIGMOD'17. Strictly speaking, this paper comes from the DEEM workshop held in conjunction with SIGMOD, but it sparked my imagination and I …
Blockbench: a framework for analyzing private blockchains
Blockbench: a framework for analyzing private blockchains Dinh et al., SIGMOD'17 Here's a paper which delivers way more than you might expect from the title alone. First we get a good discussion of private blockchains and why interest in them is growing rapidly. Then the authors analyse the core layers in a private blockchain, and …
Azure Data Lake Store: a hyperscale distributed file service for big data analytics
Azure data lake store: a hyperscale distributed file service for big data analytics Douceur et al., SIGMOD'17 Today's paper takes us inside Microsoft Azure's distributed file service called the Azure Data Lake Store (ADLS). ADLS is the successor to an internal file system called Cosmos, and marries Cosmos semantics with HDFS, supporting both Cosmos and …