Abstract
Data integration has been studied for decades, yet no satisfactory solution has emerged. A new type of system called a polystore has surfaced to partially address the integration problem. Based on experience with our own polystore, BigDAWG, we identify three major roadblocks to an acceptable commercial solution. We offer a new architecture, motivated by these three roadblocks, that trades some generality for usability. This architecture also exploits modern hardware (i.e., high-speed networks and RDMA) to gain performance. The paper concludes with some promising experimental results.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Yu, X., Gadepally, V., Zdonik, S., Kraska, T., Stonebraker, M. (2019). FastDAWG: Improving Data Migration in the BigDAWG Polystore System. In: Gadepally, V., Mattson, T., Stonebraker, M., Wang, F., Luo, G., Teodoro, G. (eds) Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2018, Poly 2018. Lecture Notes in Computer Science, vol 11470. Springer, Cham. https://doi.org/10.1007/978-3-030-14177-6_1
DOI: https://doi.org/10.1007/978-3-030-14177-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14176-9
Online ISBN: 978-3-030-14177-6
eBook Packages: Computer Science, Computer Science (R0)