Abstract
In this paper, we present ParaMiner, a generic and parallel algorithm for closed pattern mining. ParaMiner is built on the principles of pattern enumeration in strongly accessible set systems. Its efficiency is due to a novel dataset reduction technique (that we call EL-reduction), combined with a novel technique for performing dataset reduction in a parallel execution on a multi-core architecture. We illustrate ParaMiner’s genericity by using this algorithm to solve three different pattern mining problems: frequent itemset mining, mining frequent connected relational graphs, and mining gradual itemsets. We prove the soundness and the completeness of ParaMiner. Furthermore, our experiments show that, despite being a generic algorithm, ParaMiner can compete with specialized state-of-the-art algorithms designed for the pattern mining problems mentioned above. Moreover, for the particular problem of gradual itemset mining, ParaMiner outperforms the state-of-the-art algorithm by two orders of magnitude.
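To make the notion of a closed pattern concrete for the first of the three problems (frequent itemset mining), here is a minimal brute-force sketch in Python. It is not the algorithm presented in the paper: it uses neither EL-reduction nor parallelism, and it enumerates every candidate itemset, so it is only suitable for toy inputs. The function names and the example data are assumptions introduced for illustration. An itemset is treated as closed when it equals its closure, that is, the intersection of all transactions that contain it.

```python
from itertools import combinations


def support_set(itemset, transactions):
    """Indices of the transactions that contain every item of `itemset`."""
    return [i for i, t in enumerate(transactions) if itemset <= t]


def closure(itemset, transactions):
    """Largest itemset shared by all transactions that contain `itemset`."""
    covering = [transactions[i] for i in support_set(itemset, transactions)]
    if not covering:
        # No transaction contains the itemset: leave it unchanged.
        return frozenset(itemset)
    return frozenset.intersection(*covering)


def closed_frequent_itemsets(transactions, min_support):
    """Map each closed itemset with support >= min_support to its support.

    Brute force over all subsets of the item alphabet (illustrative only).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    closed = {}
    for size in range(len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            support = len(support_set(candidate, transactions))
            if support >= min_support and closure(candidate, transactions) == candidate:
                closed[candidate] = support
    return closed


# Hypothetical toy dataset: four transactions over the items a, b, c.
example = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
result = closed_frequent_itemsets(example, min_support=2)
```

In this toy dataset, {c} is frequent but not closed, because every transaction containing c also contains a; its closure {a, c} has the same support and is reported instead.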
Additional information
Responsible editor: Jian Pei.
Cite this article
Negrevergne, B., Termier, A., Rousset, MC. et al. ParaMiner: a generic pattern mining algorithm for multi-core architectures. Data Min Knowl Disc 28, 593–633 (2014). https://doi.org/10.1007/s10618-013-0313-2
DOI: https://doi.org/10.1007/s10618-013-0313-2