Journal article — ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2019

A New Framework for Evaluating Straggler Detection Mechanisms in MapReduce

Abstract

Big Data systems (e.g., Google MapReduce, Apache Hadoop, Apache Spark) rely increasingly on speculative execution to mask slow tasks, also known as stragglers, because a job's execution time is dominated by the slowest task instance. Big Data systems typically identify stragglers and speculatively run copies of those tasks, with the expectation that a copy may complete faster and thus shorten the job execution time. There is a rich body of recent results on straggler mitigation in MapReduce. However, the majority of these do not consider the problem of accurately detecting stragglers. Instead, they adopt a particular straggler detection approach and then study its effectiveness in terms of performance, e.g., reduction in job completion time, or efficiency, e.g., high resource utilization. In this paper, we consider a complete framework for straggler detection and mitigation. We start with a set of metrics that can be used to characterize and detect stragglers, including Precision, Recall, Detection Latency, Undetected Time, and Fake Positive. We then develop an architectural model by which these metrics can be linked to measures of performance, including execution time and system energy overheads. We further conduct a series of experiments to demonstrate which metrics and approaches are more effective in detecting stragglers and are also predictive of performance and energy efficiency. For example, our results indicate that the default Hadoop straggler detector could be made more effective. In one case, Precision is low: only 55% of the tasks it flags are actual stragglers, and Recall, i.e., the percentage of actual stragglers that are detected, is also relatively low at 56%. For the same case, the hierarchical approach (i.e., a green-driven detector based on the default one) achieves a Precision of 99% and a Recall of 29%. This increase in Precision translates into lower execution time and energy consumption, and thus higher performance and energy efficiency; compared to the default Hadoop mechanism, energy consumption is reduced by almost 31%. These results demonstrate how our framework can offer useful insights and be applied in practical settings to characterize and design new straggler detection mechanisms for MapReduce systems.
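The detection metrics named in the abstract can be made concrete with a small sketch. The following Python illustration is not the authors' implementation: the `Task` record, its field names, and the averaging used for Detection Latency and Undetected Time are assumptions made here for clarity; the paper defines the metrics formally.

```python
# Hypothetical sketch (not the paper's code): computing straggler-detection
# metrics from a task log. Field names and averaging are illustrative choices.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Task:
    task_id: str
    is_straggler: bool                        # ground truth from post-hoc analysis
    straggler_since: Optional[float] = None   # time the task actually became slow
    detected_at: Optional[float] = None       # time the detector flagged it, if ever
    end_time: float = 0.0

def detection_metrics(tasks: list[Task]) -> dict:
    detected = [t for t in tasks if t.detected_at is not None]
    true_pos = [t for t in detected if t.is_straggler]
    false_pos = [t for t in detected if not t.is_straggler]
    actual = [t for t in tasks if t.is_straggler]
    missed = [t for t in actual if t.detected_at is None]

    # Precision: fraction of flagged tasks that really are stragglers.
    precision = len(true_pos) / len(detected) if detected else 1.0
    # Recall: fraction of actual stragglers that get flagged.
    recall = len(true_pos) / len(actual) if actual else 1.0
    # Detection Latency: how long a real straggler runs before being flagged.
    latency = (sum(t.detected_at - t.straggler_since for t in true_pos) / len(true_pos)
               if true_pos else 0.0)
    # Undetected Time: how long missed stragglers run without ever being flagged.
    undetected = (sum(t.end_time - t.straggler_since for t in missed) / len(missed)
                  if missed else 0.0)
    # Fake Positive: share of detections that were not actually stragglers.
    fake_positive = len(false_pos) / len(detected) if detected else 0.0

    return {"precision": precision, "recall": recall,
            "detection_latency": latency, "undetected_time": undetected,
            "fake_positive": fake_positive}

if __name__ == "__main__":
    log = [
        Task("m_001", is_straggler=True,  straggler_since=10.0, detected_at=18.0, end_time=60.0),
        Task("m_002", is_straggler=True,  straggler_since=12.0, end_time=70.0),
        Task("m_003", is_straggler=False, detected_at=25.0, end_time=30.0),
        Task("m_004", is_straggler=False, end_time=28.0),
    ]
    print(detection_metrics(log))  # precision 0.5, recall 0.5, fake_positive 0.5
```

In this toy log, half of the flagged tasks are false positives and one real straggler is never detected, which is exactly the kind of behavior the framework's metrics are meant to expose before speculative copies are launched.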
Main file: paper.pdf (824.69 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02172590, version 1 (12-07-2019)
hal-02172590, version 2 (01-08-2019)

Identifiers

  • HAL Id: hal-02172590, version 1

Cite

Tien-Dat Phan, Guillaume Pallez, Shadi Ibrahim, Padma Raghavan. A New Framework for Evaluating Straggler Detection Mechanisms in MapReduce. ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2019. ⟨hal-02172590v1⟩
269 views
584 downloads
