Abstract
The need to process graph reachability queries stems from many applications that manage complex data as graphs. These applications include transportation networks, Internet traffic analysis, Web navigation, the semantic web, chemical informatics and bioinformatics systems, and computer vision. A graph reachability query, one of the primary tasks, asks whether two given data objects, u and v, are related in any way in a large and complex dataset. Formally, the query asks whether v is reachable from u in a large directed graph. In this paper, we focus on building a reachability labeling for a large directed graph in order to process reachability queries efficiently. Such a labeling must be small so that queries can be answered efficiently, and it must be fast to compute so that construction is efficient. The 2-hop cover was proposed as such a labeling for arbitrary graphs, with theoretical bounds on both the construction cost and the size of the resulting labeling. In practice, however, the reported construction cost of a 2-hop cover is very high, even on very powerful machines. In this paper, we propose a novel geometry-based algorithm that computes a high-quality 2-hop cover quickly. Our experimental results verify the effectiveness of our techniques on large real and synthetic graph datasets.
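To make the idea of a 2-hop reachability labeling concrete, the following is a minimal sketch and not the geometry-based algorithm proposed in the paper. It assumes the standard 2-hop formulation: every vertex u stores a set L_out(u) of "center" vertices it can reach and a set L_in(u) of centers that can reach it, and v is reachable from u exactly when L_out(u) and L_in(v) share a center. The construction used here is a simple pruned-BFS heuristic swapped in purely for illustration; the function names, the degree-based center ordering, and the dictionary-based graph encoding are all assumptions of this sketch.

```python
from collections import deque


def build_two_hop_labels(graph):
    """Build 2-hop labels (l_in, l_out) with an illustrative pruned-BFS heuristic.

    graph: dict mapping each vertex to an iterable of its successors.
    This is NOT the paper's algorithm; it only produces some valid labeling.
    """
    vertices = set(graph) | {v for succs in graph.values() for v in succs}
    forward = {v: list(graph.get(v, ())) for v in vertices}
    backward = {v: [] for v in vertices}
    for u, succs in forward.items():
        for v in succs:
            backward[v].append(u)

    l_in = {v: set() for v in vertices}
    l_out = {v: set() for v in vertices}

    def covered(u, v):
        # True if the labels built so far already certify that u reaches v.
        return not l_out[u].isdisjoint(l_in[v])

    # Assumed heuristic: try high-degree vertices as centers first.
    order = sorted(
        vertices,
        key=lambda v: (-(len(forward[v]) + len(backward[v])), repr(v)),
    )

    for w in order:
        # Forward search: w becomes an in-label of the vertices it reaches,
        # except where an earlier center already covers the pair (w, v).
        queue, seen = deque([w]), {w}
        while queue:
            v = queue.popleft()
            if covered(w, v):
                continue
            l_in[v].add(w)
            for nxt in forward[v]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)

        # Backward search: w becomes an out-label of the vertices reaching it.
        queue, seen = deque([w]), {w}
        while queue:
            u = queue.popleft()
            if covered(u, w):
                continue
            l_out[u].add(w)
            for prev in backward[u]:
                if prev not in seen:
                    seen.add(prev)
                    queue.append(prev)

    return l_in, l_out


def reachable(l_in, l_out, u, v):
    """Answer 'is v reachable from u?' using only the two label sets."""
    return u == v or not l_out[u].isdisjoint(l_in[v])


def labeling_size(l_in, l_out):
    """Total number of label entries, the quantity a good cover keeps small."""
    return sum(len(s) for s in l_in.values()) + sum(len(s) for s in l_out.values())
```

A query then touches only the two label sets of its endpoints, for example `reachable(l_in, l_out, "a", "c")`, which is why the total labeling size drives query efficiency while the cost of `build_two_hop_labels` corresponds to the construction cost discussed above.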
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cheng, J., Yu, J.X., Lin, X., Wang, H., Yu, P.S. (2006). Fast Computation of Reachability Labeling for Large Graphs. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_56
DOI: https://doi.org/10.1007/11687238_56
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science, Computer Science (R0)