Journal articles

Load-Aware Shedding in Stream Processing Systems

Abstract : Distributed stream processing systems are gaining momentum as a tool to perform analytics on continuous data streams. Load shedding is a technique used to handle unpredictable spikes in the input load whenever the available computing resources are not adequately provisioned. In this paper, we propose Load-Aware Shedding (LAS), a novel load shedding solution that, unlike previous work, relies neither on a predefined cost model nor on any assumption about the tuple execution duration. Leveraging sketches, LAS efficiently estimates the execution duration of each tuple with small error bounds and uses this knowledge to proactively shed input streams at any operator, limiting queuing latencies while dropping as few tuples as possible. We provide a theoretical analysis proving that LAS is an (ε, δ)-approximation of the optimal online load shedder. Furthermore, through an extensive practical evaluation based on simulations and a prototype, we evaluate its impact on stream processing applications.
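
The idea summarized in the abstract (estimate per-tuple execution time with a sketch, then shed when the expected queuing latency would exceed a bound) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `DurationSketch` and `Shedder`, the Count-Min-style layout, the hash function, and the fixed threshold `max_queue_latency` are all assumptions made for illustration only.

```python
import hashlib
from collections import deque


class DurationSketch:
    """Count-Min-style sketch of per-key execution durations (illustrative)."""

    def __init__(self, rows=4, cols=64):
        self.rows, self.cols = rows, cols
        # count[r][c]: number of tuples hashed to cell (r, c)
        # total[r][c]: sum of their measured execution durations
        self.count = [[0] * cols for _ in range(rows)]
        self.total = [[0.0] * cols for _ in range(rows)]

    def _col(self, row, key):
        # One hash per row, derived by salting the key with the row index.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.cols

    def update(self, key, duration):
        """Record a measured execution duration for a tuple with this key."""
        for r in range(self.rows):
            c = self._col(r, key)
            self.count[r][c] += 1
            self.total[r][c] += duration

    def estimate(self, key, default=0.0):
        """Mean duration from the least-loaded cell (fewest collisions)."""
        best = None
        for r in range(self.rows):
            c = self._col(r, key)
            n = self.count[r][c]
            if n and (best is None or n < best[0]):
                best = (n, self.total[r][c] / n)
        return default if best is None else best[1]


class Shedder:
    """Drops a tuple when the estimated backlog would exceed the bound."""

    def __init__(self, sketch, max_queue_latency):
        self.sketch = sketch
        self.max_queue_latency = max_queue_latency  # assumed fixed threshold
        self.pending = deque()  # estimated costs of admitted, unfinished tuples
        self.backlog = 0.0      # sum of self.pending

    def admit(self, key):
        """Return True to enqueue the tuple, False to shed it."""
        cost = self.sketch.estimate(key)
        if self.backlog + cost > self.max_queue_latency:
            return False
        self.pending.append(cost)
        self.backlog += cost
        return True

    def completed(self, key, duration):
        """Call when the oldest admitted tuple finishes executing."""
        self.backlog -= self.pending.popleft()
        self.sketch.update(key, duration)
```

In this sketch a key that has never been observed is estimated at `default` (zero unless overridden) and is therefore always admitted; a real system would need a more careful bootstrap policy.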

https://hal-imt-atlantique.archives-ouvertes.fr/hal-03115253
Contributor : Yann Busnel
Submitted on : Tuesday, January 19, 2021 - 2:56:19 PM
Last modification on : Friday, April 16, 2021 - 1:42:16 PM
Long-term archiving on : Tuesday, April 20, 2021 - 7:41:37 PM

Citation

Nicolò Rivetti, Yann Busnel, Leonardo Querzoni. Load-Aware Shedding in Stream Processing Systems. Transactions on Large-Scale Data- and Knowledge-Centered Systems, Springer Berlin / Heidelberg, 2020, pp.121-153. ⟨10.1007/978-3-662-62386-2_5⟩. ⟨hal-03115253⟩
