Abstract
Efficient join processing plays an important role in big data analysis. In this work, we focus on generic theta joins in a massively parallel environment, such as MapReduce or Spark. Theta joins are notoriously slow due to their inherent quadratic complexity, even when their selectivity is low, e.g., 1%. The main performance bottleneck differs between cases and is due to any of the following factors or their combination: the amount of data being shuffled, the memory load on reducers, or the computation load on reducers. We propose an ensemble-based partitioning approach that tackles all three aspects. In this way, we save communication cost, better respect the memory and computation limitations of reducers and, overall, reduce the total execution time. The key idea behind our partitioning is to cluster join key values following two techniques, namely matrix re-arrangement and agglomerative clustering, which can run either in isolation or in combination. We present thorough experimental results using both band queries on real data and arbitrary synthetic predicates. We show that we can save up to 45% of the communication cost and reduce the computation load of a single reducer by up to 50% in band queries, whereas the savings are up to 74% and 80%, respectively, in queries with arbitrary theta predicates. Apart from being effective, our approach allows its potential benefits to be estimated from metadata before execution, which enables informed partitioning decisions. Finally, our solutions are flexible in that they can account for any weighted combination of the three bottleneck factors.
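To give a flavor of the second clustering technique, the sketch below groups join-matrix rows agglomeratively: cell (i, j) of the boolean join matrix is true iff the i-th join-key value of one input joins with the j-th key value of the other, and the rows whose candidate cells overlap most are merged until a target number of groups remains, one per reducer. This is a minimal Java illustration, not the implementation from our codebase; the Jaccard similarity between row profiles and the fixed target of k groups are assumptions made here for the sketch.

```java
import java.util.*;

/** Minimal sketch: agglomerative clustering of join-matrix rows.
 *  jm[i][j] == true iff join-key value i of one input joins with
 *  key value j of the other. Rows with overlapping candidate cells
 *  are merged until k groups remain, one group per reducer. */
public class AgglomerativeRowClustering {

    // Jaccard similarity between the candidate-cell profiles of two clusters.
    static double jaccard(BitSet a, BitSet b) {
        BitSet inter = (BitSet) a.clone();
        inter.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        return union.isEmpty() ? 0.0
                : (double) inter.cardinality() / union.cardinality();
    }

    /** Returns k clusters, each a set of row indices of jm. */
    static List<Set<Integer>> cluster(boolean[][] jm, int k) {
        int n = jm.length, m = jm[0].length;
        List<Set<Integer>> clusters = new ArrayList<>();
        List<BitSet> profiles = new ArrayList<>();   // union of member rows
        for (int i = 0; i < n; i++) {
            clusters.add(new HashSet<>(Set.of(i)));
            BitSet bs = new BitSet(m);
            for (int j = 0; j < m; j++) if (jm[i][j]) bs.set(j);
            profiles.add(bs);
        }
        // Repeatedly merge the two most similar clusters.
        while (clusters.size() > k) {
            int bestA = 0, bestB = 1;
            double best = -1.0;
            for (int a = 0; a < clusters.size(); a++)
                for (int b = a + 1; b < clusters.size(); b++) {
                    double s = jaccard(profiles.get(a), profiles.get(b));
                    if (s > best) { best = s; bestA = a; bestB = b; }
                }
            clusters.get(bestA).addAll(clusters.remove(bestB));
            profiles.get(bestA).or(profiles.remove(bestB));
        }
        return clusters;
    }
}
```

Merging key values with overlapping candidate cells is what saves replication: tuples of the opposite input that match several values of the same group need to be shuffled to only one reducer instead of several.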
Notes
In the remainder of this work, we use the terms region, partition and group interchangeably; we also use the term reducer for the worker node where local join processing takes place, but this does not imply that our approach is tailored to a MapReduce setting only.
It is also trivial to express imb as a function of mri and rep through simple algebraic manipulation.
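For illustration, if one assumes that imb denotes the ratio of the maximum reducer input to the average reducer input, mri the maximum reducer input, and rep the total replicated input spread over k reducers (these definitions are assumptions of this sketch, not quoted from the main text), the manipulation is:

\[
\mathit{imb} \;=\; \frac{\mathit{mri}}{\mathit{rep}/k} \;=\; \frac{k \cdot \mathit{mri}}{\mathit{rep}}.
\]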
TSPk is implemented according to [9], the code of which has been integrated into our codebase under the https://github.com/JohnKoumarelas/binarythetajoins/tree/master/btj/tspk directory.
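For intuition on the matrix re-arrangement itself, the following simplified sketch orders join-matrix rows with a nearest-neighbor tour, so that rows with similar candidate cells become adjacent and contiguous partitions become denser. It is a stand-in for the TSPk algorithm of [9], not a reproduction of it; starting from row 0 and using the Hamming distance between rows are assumptions made for this illustration.

```java
import java.util.*;

/** Sketch of TSP-style matrix re-arrangement: visit join-matrix rows
 *  in a nearest-neighbor tour so that similar rows end up adjacent.
 *  Simplified stand-in for TSPk [9]; the distance metric and the
 *  starting row are assumptions of this sketch. */
public class NearestNeighborRowOrder {

    // Number of cells in which two rows differ.
    static int hamming(boolean[] a, boolean[] b) {
        int d = 0;
        for (int j = 0; j < a.length; j++) if (a[j] != b[j]) d++;
        return d;
    }

    /** Returns a permutation of the row indices of jm. */
    static int[] order(boolean[][] jm) {
        int n = jm.length;
        boolean[] visited = new boolean[n];
        int[] tour = new int[n];
        tour[0] = 0;                 // start from an arbitrary row
        visited[0] = true;
        for (int step = 1; step < n; step++) {
            int prev = tour[step - 1], next = -1, best = Integer.MAX_VALUE;
            for (int i = 0; i < n; i++) {
                if (!visited[i]) {
                    int d = hamming(jm[prev], jm[i]);
                    if (d < best) { best = d; next = i; }
                }
            }
            tour[step] = next;
            visited[next] = true;
        }
        return tour;
    }
}
```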
Available from http://cdiac.ornl.gov/ftp/ndp026c/.
References
Afrati, F., Ullman, J.: Matching bounds for the all-pairs mapreduce problem. In: Proceedings of the 17th International Database Engineering & Applications Symposium, pp. 3–4. ACM (2013)
Afrati, F.N., Sarma, A.D., Salihoglu, S., Ullman, J.D.: Upper and lower bounds on the cost of a map-reduce computation. PVLDB 6(4), 277–288 (2013)
Afrati, F.N., Ullman, J.D.: Optimizing multiway joins in a map-reduce environment. IEEE Trans. Knowl. Data Eng. 23(9), 1282–1298 (2011)
Beame, P., Koutris, P., Suciu, D.: Skew in parallel query processing. In: PODS, pp. 212–223 (2014)
Chan, H.M., Milner, D.A.: Direct clustering algorithm for group formation in cellular manufacture. J. Manuf. Syst. 1(1), 65–75 (1982)
Chen, S.-Y., Chang, T.-P., Chang, Z.-H.: An efficient theta-join query processing algorithm on mapreduce framework. In: Proceedings of the 2012 International Symposium on Computer, Consumer and Control (IS3C), pp. 686–689. IEEE (2012)
Chu, S., Balazinska, M., Suciu, D.: From theory to practice: efficient join query evaluation in a parallel database system. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31–June 4, 2015, pp. 63–78 (2015)
Chu, X., Ilyas, I.F., Koutris, P.: Distributed data deduplication. PVLDB 9(11), 864–875 (2016)
Climer, S., Zhang, W.: Rearrangement clustering: pitfalls, remedies, and applications. J. Mach. Learn. Res. 7, 919–943 (2006)
Crotty, A., Galakatos, A., Dursun, K., Kraska, T., Binnig, C., Çetintemel, U., Zdonik, S.: An architecture for compiling udf-centric workflows. PVLDB 8(12), 1466–1477 (2015)
Doulkeridis, C., Nørvåg, K.: A survey of large-scale analytical query processing in mapreduce. VLDB J. 23(3), 1–26 (2013)
Elseidy, M., Elguindy, A., Vitorovic, A., Koch, C.: Scalable and adaptive online joins. PVLDB 7(6), 441–452 (2014)
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann, Burlington (2000)
Khayyat, Z., Lucia, W., Singh, M., Ouzzani, M., Papotti, P., Quiané-Ruiz, J.-A., Tang, N., Kalnis, P.: Lightning fast and space efficient inequality joins. PVLDB 8(13), 2074–2085 (2015)
King, J.R.: Machine-component grouping in production flow analysis: an approach using a rank order clustering algorithm. Int. J. Prod. Res. 18(2), 213–232 (1980)
Koumarelas, I., Naskos, A., Gounaris, A.: Binary theta-joins using mapreduce: efficiency analysis and improvements. In: Proceedings of the International Workshop on Algorithms for MapReduce and Beyond (BMR) (in conjunction with EDBT/ICDT’2014), Athens, Greece (2014)
Lenstra, J.K., Rinnooy Kan, A.H.G.: Some simple applications of the travelling salesman problem. Oper. Res. Q. 26(4), 717–733 (1975)
Lenstra, J.K.: Technical note: Clustering a data array and the traveling-salesman problem. Oper. Res. 22(2), 413–414 (1974)
Li, F., Ooi, B.C., Özsu, M.T., Wu, S.: Distributed data management using mapreduce. ACM Comput. Surv. 46(3), 31 (2014)
McCormick, W.T., Schweitzer, P.J., White, T.W.: Problem decomposition and data reorganization by a clustering technique. Oper. Res. 20(5), 993–1009 (1972)
Okcan, A., Riedewald, M.: Processing theta-joins using mapreduce. In: SIGMOD Conference, pp. 949–960 (2011)
Okcan, A., Riedewald, M.: Anti-combining for mapreduce. In: SIGMOD Conference, pp. 839–850 (2014)
Ren, K., Kwon, Y.C., Balazinska, M., Howe, B.: Hadoop’s adolescence. PVLDB 6(10), 853–864 (2013)
Sarma, A.D., He, Y., Chaudhuri, S.: Clusterjoin: a similarity joins framework using map-reduce. PVLDB 7(12), 1059–1070 (2014)
Tao, Y., Lin, W., Xiao, X.: Minimal mapreduce algorithms. In: SIGMOD Conference, pp. 529–540 (2013)
Tous, R., Gounaris, A., Tripiana, C., Torres, J., Girona, S., Ayguade, E., Labarta, J., Becerra, Y., Carrera, D., Valero, M.: Spark deployment and performance evaluation on the marenostrum supercomputer. In: IEEE BigData (2015)
Vitorovic, A., Elseidy, M., Koch, C.: Load balancing and skew resilience for parallel joins. In: Proceedings of the ICDE (2016)
Yan, K., Zhu, H.: Two MRJs for multi-way theta-join in mapreduce. In: Internet and Distributed Computing Systems, pp. 321–332. Springer, New York (2013)
Zhang, C., Li, J., Wu, L.: Optimizing theta-joins in a mapreduce environment. Int. J. Database Theory Appl. 6(4), 91–107 (2013)
Zhang, X., Chen, L., Wang, M.: Efficient multi-way theta-join processing using mapreduce. PVLDB 5(11), 1184–1195 (2012)
Acknowledgements
We would like to thank Jordi Torres, Rubèn Tous and Carlos Tripiana from the Barcelona Supercomputing Center for their help in running the Spark experiments.
Appendix: Additional evaluation results
Tables 7 and 8 refer to the experiments in Sect. 6.3 for the band queries on solar altitude. Table 8 presents the same results as Table 7, but groups the experiments by the number of bands to show the impact of selectivity; although the behavior varies with the number of bands, the impact of selectivity is small. The second column (coverage) shows the percentage of cases in which any technique improves on M-Bucket-I, i.e., it answers the question "How frequently does matrix re-arrangement lead to improvements?", whereas the other columns answer the question "How large are the improvements when they occur?". Further observations are: (i) the higher the number of reducers, the less frequently matrix re-arrangement yields improvements; (ii) the benefits on the OF values due to the re-arrangement techniques may come at the expense of a small degradation in imbalance, as shown in the last column, but in general imb is not much affected; and (iii) in several cases the improvement is very small or negligible.

Table 9 shows the corresponding details for the band queries on longitude, where the best improvements on mrcl reach 44%. Table 10 shows the impact of the re-arrangement techniques on the OFs for random queries with a \(100 \times 100\) JM. The main observation is that, compared to Table 7, both the coverage and the improvements are higher; e.g., we have observed reductions in rep of up to 74% (i.e., nearly 4 times less) and in mrcl of up to 56%. For random queries with \(200 \times 200\) JMs, the improvements are of lower magnitude, but the coverage is 88% (detailed results omitted).
Cite this article
Koumarelas, I., Naskos, A. & Gounaris, A. Flexible partitioning for selective binary theta-joins in a massively parallel setting. Distrib Parallel Databases 36, 301–337 (2018). https://doi.org/10.1007/s10619-017-7214-0