Distributed Mega-Datasets: The Need for Novel Computing Primitives

"Distributed Mega-Datasets: The Need for Novel Computing Primitives"
Niklas Semmler, Georgios Smaragdakis, and Anja Feldmann.
IEEE ICDCS 2019. [Vision paper]

Abstract:

With ongoing digitalization, an increasing number of sensors are becoming part of our digital infrastructure. These sensors produce highly, even globally, distributed data streams. The aggregate data rate of these streams far exceeds local storage and computing capabilities. Yet, for radically new services (e.g., predictive maintenance and autonomous driving), which depend on various control loops, this data needs to be analyzed in a timely fashion.

In this position paper, we outline a system architecture that can effectively handle distributed mega-datasets using data aggregation. In doing so, we point out two research challenges: the need for (1) novel computing primitives that allow us to aggregate data at scale across multiple hierarchies (i.e., time and location) while answering a multitude of a priori unknown queries, and (2) transfer optimizations that enable rapid local and global decision making.
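
To make the idea of aggregating across time and location hierarchies concrete, here is a minimal sketch. It is not the architecture or the primitive proposed in the paper: the `Summary` class, the `aggregate` and `roll_up` functions, the site names, and the site-to-region mapping are all hypothetical, chosen only for illustration. It assumes a mergeable summary (count, sum, min, max), so that fine-grained cells can be combined into coarser ones without revisiting the raw data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Hypothetical mergeable summary of the readings in one cell."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value):
        # Fold a single raw reading into the summary.
        return Summary(self.count + 1, self.total + value,
                       min(self.minimum, value), max(self.maximum, value))

    def merge(self, other):
        # Combine two summaries; the raw readings are not needed.
        return Summary(self.count + other.count, self.total + other.total,
                       min(self.minimum, other.minimum),
                       max(self.maximum, other.maximum))

    @property
    def mean(self):
        return self.total / self.count if self.count else float("nan")


def aggregate(readings):
    """Summarize raw (hour, site, value) readings per (hour, site) cell."""
    cells = {}
    for hour, site, value in readings:
        key = (hour, site)
        cells[key] = cells.get(key, Summary()).add(value)
    return cells


def roll_up(cells, coarsen):
    """Merge cells whose keys map to the same coarser key."""
    coarse = {}
    for key, summary in cells.items():
        coarse_key = coarsen(key)
        coarse[coarse_key] = coarse.get(coarse_key, Summary()).merge(summary)
    return coarse


# Made-up example data: (hour index, site, sensor value).
readings = [
    (0, "berlin-1", 20.0),
    (0, "berlin-1", 22.0),
    (1, "berlin-2", 19.0),
    (25, "munich-1", 30.0),
]
# Assumed location hierarchy: site -> region.
site_region = {"berlin-1": "berlin", "berlin-2": "berlin", "munich-1": "munich"}

fine = aggregate(readings)  # cells keyed by (hour, site)
# Coarsen both hierarchies at once: hour -> day, site -> region.
coarse = roll_up(fine, lambda key: (key[0] // 24, site_region[key[1]]))
```

Because the summaries merge associatively, the same fine-grained cells can later be rolled up along a different grouping (for example, per week or per country), which is one simple way to serve queries that were not known when the data was collected.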

Paper:

Presentation:

BibTeX: [bibtex.html]