ASSOCIATION RULES MINING IN BIG DATA

Nataliya Shakhovska; Roman Kaminskyy; Eugen Zasoba; Mykola Tsiutsiura

doi:10.47839/ijc.17.1.946

Keywords:

Big data, association rule, data dependency, Apriori, complexity, parallel processing.

Abstract

The paper proposes a method for analyzing Big Data in the presence of different data sources and different methods of processing these data. A definition of Big Data is given, and the main problems of the data mining process are described. The concept of association rules is introduced, and the method of searching for association rules is modified to work with Big Data. A method for finding dependencies is developed, and its efficiency and suitability for parallelization are determined. The developed algorithm makes it possible to assert that the task of detecting association dependencies in distributed databases belongs to the class P, that is, it is solvable in polynomial time. The algorithm for finding association dependencies is well suited to implementation with MapReduce. The low asymptotic complexity of the developed association rules mining algorithm and the wide set of data types it supports allow the algorithm to be applied in practically all subject areas that work with association dependencies in data.
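To make the idea of mining association rules in a map/reduce style concrete, the following is a minimal sketch in Python. It is not the algorithm developed in the paper: it counts itemsets by brute force (without Apriori candidate pruning), runs in a single process, and all function names, thresholds, and the sample transactions are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def map_partition(transactions, max_size=2):
    """Map step (illustrative): count every itemset of up to max_size items
    that occurs in the transactions of one data partition."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def reduce_counts(partials):
    """Reduce step (illustrative): sum per-partition counts into global counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def rules(counts, n, min_support=0.5, min_confidence=0.7):
    """Derive rules lhs -> rhs from frequent 2-itemsets.

    support(lhs, rhs)     = count(lhs and rhs) / n
    confidence(lhs -> rhs) = count(lhs and rhs) / count(lhs)
    """
    found = []
    for itemset, count in counts.items():
        if len(itemset) != 2 or count / n < min_support:
            continue
        a, b = itemset
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / counts[(lhs,)]
            if confidence >= min_confidence:
                found.append((lhs, rhs, count / n, confidence))
    return sorted(found)


# Hypothetical data: two partitions standing in for two distributed sources.
partitions = [
    [["bread", "milk"], ["bread", "butter"]],
    [["bread", "milk", "butter"], ["milk"]],
]
counts = reduce_counts(map_partition(p) for p in partitions)
n = sum(len(p) for p in partitions)
found = rules(counts, n)
```

Because each partition is counted independently and the partial counts are merged by simple addition, the map step could in principle run on separate nodes; this is the property that makes support counting a natural fit for MapReduce.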

International Journal of Computing