Conference papers

Attack-tolerant Unequal Probability Sampling Methods over Sliding Window for Distributed Streams

Abstract : Distributed systems increasingly need to process large amounts of data for metrology, safety, or security purposes. Processing these large data streams on-line requires algorithms that compute the parameters of interest efficiently. Although elegant solutions have been proposed recently, they usually compute their approximation over the whole data stream, from its inception. In a distributed execution context, it is preferable to collect information only on the recent past, either to save resources or because the most recent information is the most relevant. We therefore consider the sliding window model. In this article, we propose a family of new sampling techniques that take into account both the sliding window model and the presence of a malicious adversary. In 1970, Wayne Fuller proposed a very ingenious method of sampling with unequal inclusion probabilities. After doing justice to this precursor paper and proposing a fast and simple implementation of it, we fully generalize Fuller's method by introducing a tuning parameter that controls spreading. The analytical results show the excellent performance of the generalized pivotal approach. The generalization makes the sampling method less predictable, which seems appropriate for protecting it against malicious attacks when sampling from a stream.
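
To make "sampling with unequal inclusion probabilities" and the "pivotal approach" concrete, the following is a minimal sketch of the classical ordered pivotal method as it is usually described in the survey-sampling literature. It is not the generalized method of the paper, it has no spreading parameter, and it does not handle the sliding window or an adversary. The function name `pivotal_sample`, the tolerance `eps`, and the example probabilities are assumptions made for illustration only.

```python
import random


def pivotal_sample(probs, rng=None, eps=1e-12):
    """Draw one sample with the given inclusion probabilities.

    Illustrative sketch of the classical ordered pivotal method.
    `probs` are inclusion probabilities in [0, 1]; if they sum to an
    integer n, the returned sample has exactly n units.
    """
    rng = rng or random.Random()
    p = list(probs)
    selected = []
    i = None  # index of the unit whose fate is still undecided

    for j in range(len(p)):
        if p[j] <= eps:            # never selected
            continue
        if p[j] >= 1 - eps:        # always selected
            selected.append(j)
            continue
        if i is None:              # nothing pending: j becomes the pending unit
            i = j
            continue

        s = p[i] + p[j]
        if s < 1:
            # One unit is eliminated, the other carries the total s.
            if rng.random() < p[j] / s:
                p[i], p[j] = 0.0, s
                i = j
            else:
                p[i], p[j] = s, 0.0
        else:
            # One unit is selected, the other carries the remainder s - 1.
            if rng.random() < (1 - p[j]) / (2 - s):
                selected.append(i)
                p[j] = s - 1
                i = j
            else:
                selected.append(j)
                p[i] = s - 1

        # Resolve the pending unit if its probability reached 0 or 1.
        if p[i] <= eps:
            i = None
        elif p[i] >= 1 - eps:
            selected.append(i)
            i = None

    # If the probabilities do not sum to an integer, one unit is left over.
    if i is not None and rng.random() < p[i]:
        selected.append(i)
    return sorted(selected)


# Hypothetical example: five units whose probabilities sum to 2.
sample = pivotal_sample([0.2, 0.5, 0.3, 0.7, 0.3], rng=random.Random(1))
print(sample)
```

Each step confronts two units and settles the fate of at least one of them while preserving every unit's expected inclusion probability, so a single pass over the data is enough. In this plain version the decisions follow a fixed order, which is the kind of predictability that the generalization described in the abstract aims to reduce.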

Contributor : Yann Busnel
Submitted on : Tuesday, January 28, 2020 - 11:13:16 AM
Last modification on : Friday, August 5, 2022 - 2:54:52 PM
Long-term archiving on : Wednesday, April 29, 2020 - 12:54:46 PM





Yann Busnel, Yves Tillé. Attack-tolerant Unequal Probability Sampling Methods over Sliding Window for Distributed Streams. ICCDA 2020 : 4th International Conference on Compute and Data Analysis, Mar 2020, San Jose, United States. pp.72-78, ⟨10.1145/3388142.3388162⟩. ⟨hal-02456880⟩


