Abstract
Astronomical catalog time series data are data collected at different times; they provide a comprehensive understanding of celestial objects’ attributes and expose various astronomical phenomena. Retrieving them is indispensable to astronomy research. However, existing time series data retrieval methods involve a great deal of manual work and are extremely time-consuming, and the complexity is further increased by the exponential growth of observation data. In this paper, we propose an automatic and efficient retrieval method for astronomical catalog time series data. To identify the time series data of the same celestial object automatically, a cross-match scheme is designed, which assigns a unique MatchID to each record matched with the datum catalog. To accelerate the matching process, an in-memory index structure based on Redis is specially designed, which makes matching 1.67 times faster than with MySQL on massive amounts of data. Moreover, Catalog-Mongo, an improved database based on MongoDB, is presented, in which a Data Blocking Algorithm is proposed to improve the data partitioning of MongoDB and accelerate query performance. The experimental results show that its query speed is about 2 times faster than that of MongoDB and 7.6 to 8.7 times faster than that of MySQL.
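The cross-match idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the index layout, so a plain Python dictionary stands in for the Redis-based in-memory index, and the grid cell size, matching radius, and all function names are hypothetical choices made for illustration only.

```python
import math
from collections import defaultdict

# Hypothetical parameters, not taken from the paper.
CELL_DEG = 0.01     # side length of a grid cell, in degrees
RADIUS_DEG = 0.002  # maximum angular separation for a match, in degrees


def cell_of(ra, dec):
    """Map a sky position (degrees) to an integer grid cell key."""
    return (math.floor(ra / CELL_DEG), math.floor(dec / CELL_DEG))


def build_index(datum_catalog):
    """Index datum records (match_id, ra, dec) by grid cell.

    A dict stands in here for an in-memory key-value store such as Redis.
    """
    index = defaultdict(list)
    for match_id, ra, dec in datum_catalog:
        index[cell_of(ra, dec)].append((match_id, ra, dec))
    return index


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees, using the haversine formula."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def match_record(index, ra, dec):
    """Return the MatchID of the nearest datum object within RADIUS_DEG, or None.

    Only the 3x3 block of cells around the record is searched. This simple
    flat grid ignores RA wrap-around at 0/360 degrees and the widening of
    RA intervals near the poles.
    """
    ci, cj = cell_of(ra, dec)
    best_id, best_sep = None, RADIUS_DEG
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for match_id, ra0, dec0 in index.get((ci + di, cj + dj), ()):
                sep = angular_sep_deg(ra, dec, ra0, dec0)
                if sep <= best_sep:
                    best_id, best_sep = match_id, sep
    return best_id


def group_time_series(index, observations):
    """Group observations (time, ra, dec, value) into per-object time series."""
    series = defaultdict(list)
    for t, ra, dec, value in observations:
        match_id = match_record(index, ra, dec)
        if match_id is not None:
            series[match_id].append((t, value))
    return dict(series)
```

Once every observation carries a MatchID, retrieving an object's time series reduces to a lookup on that identifier, which is the property the retrieval method relies on.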
Change history
05 April 2019
The original version of the chapter starting on p. 284 was revised. The grant numbers of the Joint Research Fund in Astronomy were incorrect in the acknowledgement on p. 297. The original chapter was corrected.
Acknowledgements
This work is supported by the Joint Research Fund in Astronomy (U1531111, U1731243, U1731125) under a cooperative agreement between the National Natural Science Foundation of China (NSFC) and the Chinese Academy of Sciences (CAS), and by the National Natural Science Foundation of China (11573019, 61602336).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, B. et al. (2018). An Efficient Retrieval Method for Astronomical Catalog Time Series Data. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11334. Springer, Cham. https://doi.org/10.1007/978-3-030-05051-1_20
DOI: https://doi.org/10.1007/978-3-030-05051-1_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05050-4
Online ISBN: 978-3-030-05051-1
eBook Packages: Computer Science (R0)