Flexible and Adaptive Stream Join Algorithm
Flexibility and self-adaptivity are important for real-time join processing in a parallel shared-nothing environment. Join-Matrix is a high-performance model for distributed stream joins that supports arbitrary join predicates. It is robust to data skew because it routes tuples to cells at random, with each stream corresponding to one side of the matrix. The design of the matrix partitioning scheme is the determining factor in maximizing system throughput while economizing on computing resources. In this paper, we propose a novel flexible and adaptive scheme partitioning algorithm for the stream join operator, which ensures high throughput with economical resource usage by allocating resources on demand. Specifically, a lightweight scheme generator, which requires a sample of each stream's volume and the processing resource quota of each physical machine, generates a join scheme; a migration plan generator then decides how to migrate data among machines so as to minimize migration cost while ensuring correctness. Extensive experiments on different kinds of join workloads show that the approach is highly competitive with baseline systems on the benchmark.
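The join-matrix routing mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm (the scheme generator and migration planner are not shown); it only demonstrates the general join-matrix idea under stated assumptions: a fixed grid of `rows` x `cols` cells, a tuple of stream R sent to one random row and replicated across that row's columns, and a tuple of stream S sent to one random column and replicated across that column's rows. The names `route` and `matrix_join` are hypothetical.

```python
import random
from collections import defaultdict


def route(stream, rows, cols, rng):
    """Return the (row, col) cells that one tuple of `stream` is sent to."""
    if stream == "R":
        r = rng.randrange(rows)               # random row, independent of the key
        return [(r, c) for c in range(cols)]  # replicate across the whole row
    if stream == "S":
        c = rng.randrange(cols)               # random column
        return [(r, c) for r in range(rows)]  # replicate across the whole column
    raise ValueError("stream must be 'R' or 'S'")


def matrix_join(R, S, rows, cols, predicate, seed=0):
    """Join R and S on an arbitrary predicate using a rows x cols cell grid."""
    rng = random.Random(seed)
    cells = defaultdict(lambda: {"R": [], "S": []})
    for t in R:
        for cell in route("R", rows, cols, rng):
            cells[cell]["R"].append(t)
    for t in S:
        for cell in route("S", rows, cols, rng):
            cells[cell]["S"].append(t)
    # Each cell joins only its local tuples. An R tuple in row i and an
    # S tuple in column j meet in exactly one cell, (i, j), so every pair
    # is tested exactly once regardless of the predicate.
    out = []
    for cell in cells.values():
        out.extend((r, s) for r in cell["R"] for s in cell["S"] if predicate(r, s))
    return out
```

Because the row and column are chosen at random rather than by hashing a join key, load is spread evenly even when key values are skewed; the price is replication, which grows with the matrix dimensions and is why the choice of partitioning scheme matters.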
Junhua Fang, Xiaotong Wang, Rong Zhang, Aoying Zhou
Institute for Data Science and Engineering, Software Engineering Institute, East China Normal University, Shanghai, China
International conference
18th International Asia-Pacific Web Conference
Suzhou
English
3-16
2016-09-23 (date first posted online on the Wanfang platform; not the paper's publication date)