Abstract
Weighted graphs are more useful than their unweighted counterparts for representing complex inter-relationships among entities. In a transactional graph setting, researchers have made several attempts to mine weighted frequent subgraphs from a collection of edge-weighted graphs; these subgraphs serve as representative features of the underlying graph database and can be used for further analysis. The weighted support of a pattern does not satisfy the downward closure property, which frequent pattern mining often relies on to control the search space, and this makes weighted frequent substructure mining a very difficult task. This article proposes an efficient weighted frequent subgraph mining framework called WFSM-MaxPWS for graphs with static edge weights. We introduce a new pruning technique called MaxPWS pruning, used along with canonical labeling of subgraphs, which reduces the search space significantly without compromising completeness. Extending the WFSM-MaxPWS framework, we propose another framework called DewgSpan that is capable of mining graphs with dynamic edge weights. DewgSpan uses a summarized edge-weight distribution table to overcome the new challenges of dynamic edge-weight settings. Evaluation results show that WFSM-MaxPWS and DewgSpan are significantly faster than the existing MaxW pruning technique for weighted pattern mining.
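The loss of downward closure mentioned in the abstract can be illustrated with a toy sketch. It rests on assumptions that do not come from the article: weighted support is taken to be the support count multiplied by the mean edge weight of the pattern, each graph is reduced to a set of labeled edges, and subgraph containment is approximated by a plain subset test rather than subgraph isomorphism. The names `EDGE_WEIGHT`, `DATABASE`, `weighted_support`, and `MIN_WSUP` are hypothetical.

```python
from statistics import mean

# Hypothetical static edge weights, keyed by an edge label.
EDGE_WEIGHT = {"a-b": 0.25, "b-c": 1.0}

# Toy graph database: each graph is reduced to its set of labeled edges.
# This ignores real graph structure and subgraph isomorphism.
DATABASE = [
    {"a-b", "b-c"},
    {"a-b", "b-c"},
    {"a-b"},
]

MIN_WSUP = 1.0  # hypothetical weighted-support threshold


def weighted_support(pattern):
    """Support count times mean edge weight (an assumed definition)."""
    support = sum(1 for graph in DATABASE if pattern <= graph)
    return support * mean(EDGE_WEIGHT[edge] for edge in pattern)


small = frozenset({"a-b"})         # sub-pattern
large = frozenset({"a-b", "b-c"})  # super-pattern that contains `small`

print(weighted_support(small))  # 3 graphs * 0.25 = 0.75
print(weighted_support(large))  # 2 graphs * 0.625 = 1.25
```

Under these assumptions the larger pattern meets the threshold while its own sub-pattern does not, so a miner cannot discard every extension of a pattern simply because that pattern falls below the threshold. That is why a separate upper bound on weighted support is needed for pruning.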
Data Availability
The data that support the findings of this study are available at https://github.com/cseduashraful/graphdatasets
References
Agrawal R, Srikant R (1994) Fast algorithms for mining association rules in large databases. In: VLDB’94, Proceedings of 20th international conference on very large data bases, September 12-15, 1994, Santiago de Chile, Chile, pp 487–499
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: SIGMOD ’00: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp 1–12. ACM
Han J, Pei J, Mortazavi-Asl B, Pinto H, Chen Q, Dayal U, Hsu M (2001) PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In: Proceedings of the 17th international conference on data engineering, pp 215–224
Islam MA, Rafi MR, Azad Aa, Ovi JA (2022) Weighted frequent sequential pattern mining. Appl Intell 52(1):254–281
Nguyen H, Le T, Nguyen M, Fournier-Viger P, Tseng VS, Vo B (2022) Mining frequent weighted utility itemsets in hierarchical quantitative databases. Knowledge-Based Systems 237:107709
Roy KK, Moon MHH, Rahman MM, Ahmed CF, Leung CK (2021) Mining sequential patterns in uncertain databases using hierarchical index structure. In: Advances in knowledge discovery and data mining: 25th Pacific-Asia Conference, PAKDD 2021, Virtual Event, May 11–14, 2021, Proceedings, Part II, Springer, pp 29–41
Leung CKS, Tanbeer SK (2013) PUF-tree: a compact tree structure for frequent pattern mining of uncertain data. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer, pp 13–25
Wang J, Liu C, Fu X, Luo X, Li X (2019) A three-phase approach to differentially private crucial patterns mining over data streams. Computers & Security 82:30–48
Tsuda K, Kudo T (2006) Clustering graphs by weighted substructure mining. In: Proceedings of the 23rd international conference on Machine learning, ACM, pp 953–960
Cheng Z, Flouvat F, Selmaoui-Folcher N (2017) Mining recurrent patterns in a dynamic attributed graph. In: Proceedings of 21st Pacific-Asia conference on knowledge discovery and data mining (PAKDD 2017), Part II, pp 631–643
Huang Z, Ye Y, Li X, Liu F, Chen H (2017) Joint weighted nonnegative matrix factorization for mining attributed graphs. In: Proceedings of 21st Pacific-Asia conference on knowledge discovery and data mining (PAKDD 2017), Part I, pp 368–380
Khan A, Akcora CG (2022) Graph-based management and mining of blockchain data. In: Proceedings of the 31st ACM international conference on information & knowledge management, pp 5140–5143
Ning B, Sun Y, Tao X, Li G (2021) Differential privacy protection on weighted graph in wireless networks. Ad hoc networks 110:102303
Gu Z, Liu H, Feng S (2022) Diversity-induced consensus and structured graph learning for multi-view clustering. Appl Intell pp 1–15
Li K, Ye W (202) Semi-supervised node classification via graph learning convolutional neural network. Appl Intell pp 1–13
Ju W, Qin Y, Qiao Z, Luo X, Wang Y, Fu Y, Zhang M (2022) Kernel-based substructure exploration for next POI recommendation. In: 2022 IEEE International conference on data mining (ICDM), IEEE, pp 221–230
Zhang Z, Bu J, Ester M, Li Z, Yao C, Yu Z, Wang C (2021) H2MN: Graph similarity learning with hierarchical hypergraph matching networks. In: Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, pp 2274–2284
Yan X, Han J (2002) gSpan: Graph-based substructure pattern mining. In: 2002 IEEE International conference on data mining, IEEE, pp 721–724
Nijssen S, Kok JN (2005) The Gaston tool for frequent subgraph mining. Electronic Notes in Theoretical Computer Science 127(1):77–87
Nguyen D, Luo W, Nguyen TD, Venkatesh S, Phung D (2018) Learning graph representation via frequent subgraphs. In: Proceedings of the 2018 SIAM International Conference on Data Mining, SIAM, pp 306–314
Alam MT, Ahmed CF, Samiullah M, Leung CK (2021) Discriminating frequent pattern based supervised graph embedding for classification. In: Advances in knowledge discovery and data mining, pp 16–28
Nowozin S, Tsuda K, Uno T, Kudo T, Bakır G (2007) Weighted substructure mining for image analysis. In: 2007 IEEE Conference on computer vision and pattern recognition, IEEE, pp 1–8
Henderson TA, Podgurski A (2018) Behavioral fault localization by sampling suspicious dynamic control flow subgraphs. In: 2018 IEEE 11th International conference on software testing, verification and validation (ICST), IEEE, pp 93–104
Salehi Z, Ghiasi M, Sami A (2012) A miner for malware detection based on API function calls and their arguments. In: Artificial intelligence and signal processing (AISP), 2012 16th CSI International Symposium on, IEEE, pp 563–568
Du Y, Wang J, Li Q (2017) An android malware detection approach using community structures of weighted function call graphs. IEEE Access 5:17478–17486
Lakhotia A, Preda MD, Giacobazzi R (2013) Fast location of similar code fragments using semantic ‘juice’. In: Proceedings of the 2nd ACM SIGPLAN program protection and reverse engineering workshop, ACM, pp 5
Ahmed CF, Tanbeer SK, Jeong BS, Lee YK, Choi HJ (2012) Single-pass incremental and interactive mining for weighted frequent patterns. Expert Systems with Applications 39(9):7976–7994
Zou Z, Li J, Gao H, Zhang S (2010) Mining frequent subgraph patterns from uncertain graph data. IEEE Transactions on Knowledge and Data Engineering 22(9):1203–1218
Bogdanov P, Mongiovì M, Singh AK (2011) Mining heavy subgraphs in time-evolving networks. In: 2011 IEEE 11th International conference on data mining, IEEE, pp 81–90
Rozenshtein P, Gionis A (2019) Mining temporal networks. In: Proceedings of the 25th ACM SIGKDD International conference on knowledge discovery & data mining, ACM, pp 3225–3226
Petelin B, Kononenko I, Malačič V, Kukar M (2019) Frequent subgraph mining in oceanographic multi-level directed graphs. Int J Geographical Inf Sci 1–24
Gong Y, Jia L (2019) Research on SVM environment performance of parallel computing based on large data set of machine learning. J Supercomput 1–18
Eichinger F, Böhm K, Huber M (2008) Mining edge-weighted call graphs to localise software bugs. In: Joint european conference on machine learning and knowledge discovery in databases, Springer, pp 333–348
Jiang C, Coenen F (2008) Graph-based image classification by weighting scheme. In: International conference on innovative techniques and applications of artificial intelligence, Springer, pp 63–76
Shinoda M, Ozaki T, Ohkawa T (2009) Weighted frequent subgraph mining in weighted graph databases. In: 2009 IEEE International conference on data mining workshops, IEEE, pp 58–63
Ozaki T, Etoh M (2011) Closed and maximal subgraph mining in internally and externally weighted graph databases. In: Proceedings of the 2011 IEEE International conference on advanced information networking and applications (AINA 2011) Workshops, IEEE, pp 626–631
Alam MT, Roy A, Ahmed CF, Islam MA, Leung CK (2023) UGMINE: utility-based graph mining. Applied Intelligence 53(1):49–68
Eichinger F, Huber M, Böhm K (2010) On the usefulness of weight-based constraints in frequent subgraph mining. In: SGAI Conf., Springer, pp 65–78
Jiang C, Coenen F, Zito M (2010) Frequent sub-graph mining on edge weighted graphs. In: International conference on data warehousing and knowledge discovery, Springer, pp 77–88
Jiang C, Coenen F, Zito M (2010) Finding frequent subgraphs in longitudinal social network data using a weighted graph mining approach. Adv Data Mining Appl 405–416
Elsayed A, Coenen F, Jiang C, Garcia-Finana M, Sluming V (2010) Corpus callosum MR image classification. Knowledge-Based Systems 23(4):330–336
Jiang C, Coenen F, Sanderson R, Zito M (2010) Text classification using graph mining-based feature extraction. In: Research and Development in Intelligent Systems XXVI, Springer, pp 21–34
Lee G, Yun U (2012) Mining weighted frequent sub-graphs with weight and support affinities. In: International workshop on multi-disciplinary trends in artificial intelligence, Springer, pp 224–235
Lee G, Yun U, Kim D (2016) A weight-based approach: frequent graph pattern mining with length-decreasing support constraints using weighted smallest valid extension. Advanced Science Letters 22(9):2480–2484
Babu N, John A (2016) A distributed approach to weighted frequent subgraph mining. In: International conference on emerging technological trends [ICETT], IEEE, pp 1–7
Gupta A, Thakur H, Gupta T, Yadav S (2017) Regular pattern mining (with jitter) on weighted-directed dynamic graphs. Journal of Engineering Science and Technology 12(2):349–364
Le NT, Vo B, Nguyen LB, Fujita H, Le B (2020) Mining weighted subgraphs in a single large graph. Information Sciences 514:149–165
Le NT, Vo B, Nguyen LB, Le B (2022) OWGraMi: Efficient method for mining weighted subgraphs in a single graph. Expert Syst Appl 117625
Ashraf N, Haque RR, Islam M, Ahmed CF, Leung CK, Mai JJ, Wodi BH et al (2019) WeFreS: weighted frequent subgraph mining in a single large graph. In: Industrial conference on data mining. ibai publishing
Islam MA, Ahmed CF, Leung CK, Hoi CS (2018) WFSM-MaxPWS: an efficient approach for mining weighted frequent subgraphs from edge-weighted graph databases. In: Pacific-Asia conference on knowledge discovery and data mining, Springer, pp 664–676
Zaki MJ, Meira W (2014) Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press, New York, NY, USA
Yan X Graph datasets. http://www.cs.ucsb.edu/~xyan/dataset.htm
Mehmood D, Shafiq B, Vaidya J, Hong Y, Adam N, Atluri V (2012) Privacy-preserving subgraph discovery. In: IFIP Annual conference on data and applications security and privacy, Springer, pp 161–176
Acknowledgements
Of the two frameworks discussed in this article, a preliminary version of the one for static edge-weighted substructure mining was previously published at PAKDD 2018 [50].
Funding
This work is partially supported by (a) the University of Dhaka, (b) the Natural Sciences and Engineering Research Council of Canada (NSERC), and (c) the University of Manitoba.
Author information
Contributions
Conceptualization: Md. Ashraful Islam & Chowdhury Farhan Ahmed
Methodology: Md. Ashraful Islam
Formal analysis and investigation: Md. Ashraful Islam, Chowdhury Farhan Ahmed & Md. Tanvir Alam
Writing - original draft preparation: Md. Ashraful Islam & Md. Tanvir Alam
Writing - review and editing: Chowdhury Farhan Ahmed & Carson Kai-Sang Leung
Supervision: Chowdhury Farhan Ahmed & Carson Kai-Sang Leung
Ethics declarations
Competing Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Ethical and informed consent for data used
The data used are open source and have no associated privacy or copyright issues.
About this article
Cite this article
Islam, M.A., Ahmed, C.F., Alam, M.T. et al. Graph-based substructure pattern mining with edge-weight. Appl Intell 54, 3756–3785 (2024). https://doi.org/10.1007/s10489-024-05356-7
DOI: https://doi.org/10.1007/s10489-024-05356-7