Mosaic: Processing a trillion-edge graph on a single machine Maass et al., EuroSys'17 Unless your graph is bigger than Facebook's, you can process it on a single machine. With the inception of the internet, large-scale graphs comprising web graphs or social networks have become common. For example, Facebook recently reported their largest social graph comprises … Continue reading Mosaic: processing a trillion-edge graph on a single machine
Tag: Performance
Measuring, troubleshooting and improving performance.
BOAT: Building auto-tuners with structured Bayesian optimization
BOAT: Building auto-tuners with structured Bayesian optimization Dalibard et al., WWW'17 Due to their complexity, modern systems expose many configuration parameters which users must tune to maximize performance... From the number of machines used in a distributed application, to low-level parameters such as compiler flags, managing configurations has become one of the main challenges faced … Continue reading BOAT: Building auto-tuners with structured Bayesian optimization
CherryPick: Adaptively unearthing the best cloud configurations for big data analytics
CherryPick: Adaptively unearthing the best cloud configurations for big data analytics Alipourfard et al., NSDI'17 For big data analytics jobs, especially recurring jobs, finding a good cloud configuration (number and type of machines, CPU, memory, disk and network options) can make a big difference to overall cost and runtimes. Likewise, a poor choice can seriously … Continue reading CherryPick: Adaptively unearthing the best cloud configurations for big data analytics
Improving user perceived page load time using gaze
Improving user perceived page load time using gaze Kelton, Ryoo, et al., NSDI 2017 I feel like I'm stretching things a little bit including this paper in an IoT flavoured week, but it does at least bridge from the physical world to the virtual, if only via a webcam. What's really interesting here to … Continue reading Improving user perceived page load time using gaze
Stochastic program optimization
Stochastic program optimization Schkufza et al., CACM 2016 Yesterday we saw that DeepCoder can find solutions to simple programming problems using a guided search. DeepCoder needs a custom DSL, and a maximum program length of 5 functions. In 'Stochastic program optimization' Schkufza et al. also use a search strategy to generate code that meets a … Continue reading Stochastic program optimization
Enlightening the I/O path: A holistic approach for application performance
Enlightening the I/O Path: A holistic approach for application performance Kim et al., FAST '17 Lots of applications contain a mix of foreground and background tasks. Since we're at the file system level here, for application, think Redis, MongoDB, PostgreSQL and so on. Typically user requests are considered foreground tasks, and tasks such as housekeeping, … Continue reading Enlightening the I/O path: A holistic approach for application performance
Self-driving database management systems
Self-driving database management systems Pavlo et al., CIDR 2017 We've previously seen many papers looking into how distributed and database systems technologies can support machine learning workloads. Today's paper choice explores what happens when you do it the other way round - i.e., embed machine learning into a DBMS in order to continuously optimise its … Continue reading Self-driving database management systems
Weld: A common runtime for high performance data analytics
Weld: A common runtime for high performance data analytics Palkar et al. CIDR 2017 This is the first in a series of posts looking at papers from CIDR 2017. See yesterday's post for my conference overview. We have a proliferation of data and analytics libraries and frameworks - for example, Spark, TensorFlow, MXNet, NumPy, Pandas, … Continue reading Weld: A common runtime for high performance data analytics
DQBarge: Improving data-quality tradeoffs in large-scale internet services
DQBarge: Improving data-quality tradeoffs in large-scale Internet services Chow et al. OSDI 2016 I'm sure many of you recall the 2009 classic "The Datacenter as a Computer," which encouraged us to think of the datacenter as a warehouse-scale computer. From being glad simply to have such a computer, the bar keeps on moving. We don't … Continue reading DQBarge: Improving data-quality tradeoffs in large-scale internet services
REX: A development platform and online learning approach for runtime emergent software systems
REX: A development platform and online learning approach for runtime emergent software systems Porter et al. OSDI 2016 If you can get beyond the (for my taste, ymmv) somewhat grand claims and odd turns of phrase (e.g., “how the software ‘feels’ at a given point in time” => metrics) then there’s something quite interesting at … Continue reading REX: A development platform and online learning approach for runtime emergent software systems
