Raising Energy Efficiency and Fault Tolerance with Parallel Streaming Application Scheduling on Multicore Systems

Abstract

The widespread use of information technology in the face of scarce resources and deteriorating environmental conditions, occasionally complemented by technical requirements in the case of low-power or mobile devices, calls for a sharp focus on energy efficiency. In this dissertation, several problems related to scheduling parallel streaming applications on multicore systems for energy-efficient execution are discussed. To accommodate the surge of heterogeneous devices in recent hardware development, an extension of an existing scheduler based on linear programming to heterogeneous applications and systems is proposed. Furthermore, two techniques are devised which aim to reduce the computational effort for schedule creation while maintaining solution quality. The first computes a solution to a relaxed problem and heuristically transforms it into a solution to the original problem. The second incrementally introduces constraints on various aspects of the initial problem, analyses the search space, and examines the practical implications in terms of scheduling time and energy efficiency. Recommendations for scheduler design are derived, and a novel scheduler is conceived. To cover dynamic behavior of streaming applications, a hybrid technique capable of rapidly adapting schedules at runtime is deployed, both for homogeneous and for heterogeneous systems. If adjusting the hardware design to the application at hand is an option, chip and schedule can be jointly optimized to advance energy efficiency. To facilitate this, an approach based on linear programming is presented. In addition to energy efficiency, fault tolerance is a relevant criterion in practice. To compensate for hardware failures, a method is provided which adapts the schedule to reflect the loss of available system resources and replicate crashed tasks, and extreme cases are studied.
Moreover, a robustness metric is established to determine a schedule’s tolerance for task delays as well as its potential to make up for delays by tapping into slack and increasing operating frequencies. Several approaches to jointly optimize schedules for energy efficiency and robustness are developed.

Publication
FernUniversität in Hagen
