Abstract
Graph simulation, a well-studied model of graph pattern matching, has been adopted to reduce matching complexity and to meet the needs of novel applications such as mining potential associations between users in online social networks. In recent years, graph processing frameworks such as Pregel have introduced a vertex-centric, Bulk Synchronous Parallel (BSP) programming model for processing massive data graphs and have achieved encouraging results. However, developing efficient vertex-centric algorithms for graph simulation is very challenging, because the problem does not align naturally with a vertex-centric programming model. This paper presents novel distributed algorithms for graph simulation based on the vertex-centric programming model. Because message passing and pattern matching are both costly when processing massive data graphs, the message-passing part of the algorithms is optimized to reduce communication cost. We experimentally verify the effectiveness and efficiency of these algorithms using real-life massive graph data.
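To make the setting concrete, the following is a minimal single-process sketch of graph simulation written in a vertex-centric, superstep-by-superstep style. It is not the algorithm proposed in the paper: the function name `vertex_centric_simulation`, the input layout (label dictionaries plus edge lists), and the choice to resend full match sets in every superstep are all assumptions made for illustration. Each data vertex keeps the set of pattern vertices it may still match, sends that set to its parents, and drops a candidate when one of the candidate's pattern children is matched by none of its own children.

```python
from collections import defaultdict


def vertex_centric_simulation(pattern_labels, pattern_edges, data_labels, data_edges):
    """Illustrative sketch (hypothetical helper, not the paper's algorithm).

    pattern_labels / data_labels: dict mapping vertex id -> label.
    pattern_edges / data_edges: iterable of directed (source, target) pairs.
    Returns a dict mapping each pattern vertex to its set of matching data vertices.
    """
    # Children of each pattern vertex: the constraints a match must satisfy.
    q_children = defaultdict(set)
    for u, u_child in pattern_edges:
        q_children[u].add(u_child)

    # Parents of each data vertex: the recipients of its messages.
    parents = defaultdict(set)
    for v, v_child in data_edges:
        parents[v_child].add(v)

    # Initial state: every data vertex is a candidate for all pattern
    # vertices that carry the same label.
    match = {
        v: {u for u, label in pattern_labels.items() if label == data_labels[v]}
        for v in data_labels
    }

    changed = True
    while changed:
        # Message-passing phase: each vertex sends its current match set
        # to its parents (assumption: full sets are resent every superstep).
        inbox = defaultdict(set)
        for v, candidates in match.items():
            for p in parents[v]:
                inbox[p] |= candidates

        # Compute phase: keep u only if every pattern child of u is matched
        # by at least one child of v, as reported in the inbox.
        changed = False
        new_match = {}
        for v, candidates in match.items():
            kept = {u for u in candidates if q_children[u] <= inbox[v]}
            if kept != candidates:
                changed = True
            new_match[v] = kept
        match = new_match  # barrier: all vertices switch state together

    # Invert the per-vertex state into a relation keyed by pattern vertex.
    result = {u: set() for u in pattern_labels}
    for v, candidates in match.items():
        for u in candidates:
            result[u].add(v)

    # Convention assumed here: the pattern matches only if every pattern
    # vertex has at least one match; otherwise the relation is empty.
    if any(not vs for vs in result.values()):
        return {u: set() for u in pattern_labels}
    return result


if __name__ == "__main__":
    pattern_labels = {"a": "A", "b": "B"}
    pattern_edges = [("a", "b")]
    data_labels = {1: "A", 2: "B", 3: "A"}
    data_edges = [(1, 2)]
    print(vertex_centric_simulation(pattern_labels, pattern_edges, data_labels, data_edges))
```

In this sketch every vertex resends its whole match set in each superstep, which is the kind of communication cost a tuned implementation would try to reduce, for example by sending only the candidates that were removed.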
Acknowledgements
The authors acknowledge the financial support from the following foundations: National Key R&D Program of China (No. 2017YFC0803700), National Natural Science Foundation of China (61562091), Natural Science Foundation of Yunnan Province (2016FB110).
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Li, J., Li, J., Wang, X. (2018). A Vertex-Centric Graph Simulation Algorithm for Large Graphs. In: Xu, Z., Gao, X., Miao, Q., Zhang, Y., Bu, J. (eds) Big Data. Big Data 2018. Communications in Computer and Information Science, vol 945. Springer, Singapore. https://doi.org/10.1007/978-981-13-2922-7_16
DOI: https://doi.org/10.1007/978-981-13-2922-7_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-2921-0
Online ISBN: 978-981-13-2922-7
eBook Packages: Computer Science (R0)