NOVEL PRE-PROCESSING FRAMEWORK TO IMPROVE CLASSIFICATION ACCURACY IN OPINION MINING

Helen Josephine V. L., Duraisamy S.

Abstract


The growth of information technology led to the development of the Internet, which in turn has helped people in many ways. A major one is the ability to express views about products and services through reviews, blogs, feedback, and comments on websites and social media. Buyers often have to work through these reviews and blogs before choosing a product or service. Among online services, mobile learning apps play a vital role in increasing the thirst for knowledge. To identify a suitable mobile learning app, however, the opinions of existing customers need to be mined. This paper analyzes the mobile learning app reviews available in the corpus. A novel pre-processing framework is proposed to improve classification accuracy on a mobile learning app review dataset. The dimensionality of the corpus is reduced using singular value decomposition (SVD), which prepares the data for mining. Classification accuracy is evaluated by applying the Multinomial Naïve Bayes and Random Forest data mining algorithms and the Learning Vector Quantization (LVQ), Elman Neural Network (ENN), and Feed Forward Neural Network (FFNN) algorithms to the dataset produced by the proposed pre-processing method.
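
As a rough illustration of the kind of pipeline the abstract describes, the sketch below cleans a few review texts and classifies them with a small Multinomial Naïve Bayes model written in plain Python. It is not the authors' framework: the stop-word list, the `preprocess` function, the `MultinomialNB` class, and the four toy reviews are all hypothetical, and the SVD dimension-reduction step is omitted for brevity.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the paper's actual list is not given in the abstract.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "to", "of", "i", "for", "very"}


def preprocess(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


class MultinomialNB:
    """Minimal Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.term_counts = defaultdict(Counter)      # term frequencies per class
        for tokens, label in zip(docs, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        self.total_docs = len(labels)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        vocab_size = len(self.vocab)
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + self.alpha * vocab_size
            score = math.log(n_docs / self.total_docs)   # log prior
            for t in tokens:
                if t in self.vocab:                      # skip words never seen in training
                    score += math.log((counts[t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up reviews standing in for a mobile learning app review corpus.
reviews = [
    ("This app is great and very easy to use", "pos"),
    ("Great lessons, I love this app", "pos"),
    ("The app crashes and is slow", "neg"),
    ("Slow, buggy and full of ads", "neg"),
]
docs = [preprocess(text) for text, _ in reviews]
labels = [label for _, label in reviews]
model = MultinomialNB().fit(docs, labels)

print(model.predict(preprocess("I love this great app")))
print(model.predict(preprocess("slow and buggy")))
```

In a fuller version, the token lists would typically be turned into a weighted term-document matrix and reduced with SVD before classification, as the abstract outlines.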

Keywords


Pre-processing framework; Opinion Mining; Machine Learning; Neural network.
