Online reconstruction of structural information from datacenter logs

Chothia et al., EuroSys'17. Today's choice brings together a couple of themes that we've previously looked at on The Morning Paper: recovering system information from log files, and dataflows for stream processing. On log files (and tracing), see for example Dapper, the MysteryMachine, lprof, and Pivot tracing. ...

CherryPick: Adaptively unearthing the best cloud configurations for big data analytics

Alipourfard et al., NSDI'17. For big data analytics jobs, especially recurring jobs, finding a good cloud configuration (number and type of machines, CPU, memory, disk, and network options) can make a big difference to overall cost and runtimes. Likewise, a poor choice can seriously ...

vCorfu: A cloud-scale object store on a shared log

Wei et al., NSDI'17. vCorfu builds on the idea of a distributed shared log that we looked at yesterday with CORFU, to construct a distributed object store. We show that vCorfu outperforms Cassandra, a popular state-of-the-art NoSQL store, while providing strong consistency (opacity, read-own-writes), efficient transactions, ...