Medea: scheduling of long running applications in shared production clusters Garefalakis et al., EuroSys'18 (If you don’t have ACM Digital Library access, the paper can be accessed either by following the link above directly from The Morning Paper blog site). We’re sticking with schedulers today, and a really interesting system called Medea which is designed …
Tag: Scheduling
Scheduling in distributed systems.
Optimus: an efficient dynamic resource scheduler for deep learning clusters
Optimus: an efficient dynamic resource scheduler for deep learning clusters Peng et al., EuroSys'18 (If you don’t have ACM Digital Library access, the paper can be accessed either by following the link above directly from The Morning Paper blog site). It’s another paper promising to reduce your deep learning training times today. But instead of …
Apache Hadoop YARN: Yet another resource negotiator
Apache Hadoop YARN: Yet Another Resource Negotiator Vavilapalli et al., SoCC 2013 The opening section of Prof. Demirbas' reading list is concerned with programming the datacenter, aka 'the Datacenter Operating System' - though I can't help but think of Mesosphere when I hear that latter phrase. There are four papers: in publication order these are …
Morpheus: Towards automated SLOs for enterprise clusters
Morpheus: Towards automated SLOs for enterprise clusters Jyothi et al. OSDI 2016 I'm really impressed with this paper - it covers all the bases from user studies to find out what's really important to end users, to data-driven engineering, a sprinkling of algorithms, a pragmatic implementation being made available in open source, and of course, …
Firmament: Fast, centralized cluster scheduling at scale
Firmament: Fast, centralized cluster scheduling at scale Gog et al., OSDI '16 (Updated link to point to the official USENIX-hosted version.) As this paper demonstrates very well, cluster scheduling is a tricky thing to get right at scale. It sounds so simple on the surface: "here are some new jobs/tasks - where should I run …
HCloud: Resource-efficient provisioning in shared cloud systems
HCloud: Resource-efficient provisioning in shared cloud systems - Delimitrou & Kozyrakis, ASPLOS '16 Do you use the public cloud? If so, I'm pretty confident you're going to find today's paper really interesting. Delimitrou & Kozyrakis study the provisioning strategies that provide the best balance between performance and cost. The sweet spot, it turns out, is …
The Linux Scheduler: a Decade of Wasted Cores
The Linux Scheduler: a Decade of Wasted Cores - Lozi et al. 2016 This is the first in a series of papers from EuroSys 2016. There are three strands here: first of all, there's some great background into how scheduling works in the Linux kernel; secondly, there's a story about Software Aging and how changing …
Universal Packet Scheduling
Universal Packet Scheduling - Mittal et al. 2015 (presented at NSDI '16) Is there a universal scheduling algorithm, such that simply by changing its configuration parameters, we can produce any desired schedule? In Universal Packet Scheduling, Mittal et al. show us that in theory there can be no Universal Packet Scheduling (UPS) algorithm which achieves …
Split-Level IO Scheduling
Split-Level IO Scheduling - Yang et al. 2015 The central idea in today's paper is pretty simple: block-level I/O schedulers (the most common kind) lack the higher level information necessary to perform write-reordering and accurate accounting, whereas system-call level schedulers have the appropriate context but lack the low-level knowledge needed to build efficient schedulers - …
Cloud Computing Resource Scheduling and a Survey of its Evolutionary Approaches
Cloud Computing Resource Scheduling and a Survey of its Evolutionary Approaches - Zhan et al. 2015 In both academia and industry, the problem of cloud resource scheduling is seen to be as hard as a Nondeterministic Polynomial (NP) optimization problem, that is, an NP-hard problem, whose intractability increases exponentially with the number of variables if …