Diabetes Prediction Using Binary Grey Wolf Optimization and Decision Tree


  • Layla A. Al.Hak




Keywords: Diabetes, Decision tree, Grey Wolf Optimization


Type 2 diabetes is a well-known lifelong disease that reduces the human body’s ability to produce insulin. This causes high blood sugar levels, which lead to complications including stroke, cardiovascular disease, and eye, kidney, and nerve damage. Although diabetes has attracted extensive research attention, machine learning classifiers for this problem still perform relatively poorly, primarily because of class imbalance and missing values in the data. In this work, we propose a model that combines binary grey wolf optimization (GWO) with a decision tree. The model consists of three steps: preprocessing, feature selection, and classification. The preprocessing step oversamples the minority class and handles missing values. In the second step, binary GWO selects the most significant features. In the third step, a decision tree classifier is trained on the selected features. The model achieved an accuracy of 83.11% on the Pima Indians diabetes dataset.
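The feature selection step can be illustrated with a minimal sketch of binary GWO, assuming the common sigmoid-transfer variant; the abstract does not say which binary variant, population size, or fitness function the paper uses. The names `binary_gwo`, `toy_fitness`, and `INFORMATIVE` are hypothetical, as are all parameter values. In a wrapper setting the fitness would typically be the validation accuracy of a decision tree trained on the selected features; here a toy fitness stands in so the sketch stays self-contained.

```python
import math
import random


def binary_gwo(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Maximize fitness(mask) over 0/1 feature masks (sigmoid-transfer binary GWO)."""
    rng = random.Random(seed)
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_score = None, -math.inf

    for t in range(n_iters + 1):
        # Rank the pack: the three fittest wolves lead the search.
        ranked = sorted(wolves, key=fitness, reverse=True)
        alpha, beta, delta = ranked[:3]
        score = fitness(alpha)
        if score > best_score:
            best_mask, best_score = list(alpha), score
        if t == n_iters:
            break  # final population evaluated, no further update

        a = 2.0 * (1.0 - t / n_iters)  # shrinks linearly from 2 toward 0
        new_wolves = []
        for wolf in wolves:
            new_wolf = []
            for d in range(n_features):
                # Continuous GWO update: average the pull of the three leaders.
                x = 0.0
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolf[d])
                    x += leader[d] - A * D
                x /= 3.0
                # Sigmoid transfer maps the continuous value to a bit probability.
                prob = 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))
                new_wolf.append(1 if rng.random() < prob else 0)
            new_wolves.append(new_wolf)
        wolves = new_wolves

    return best_mask, best_score


# Hypothetical stand-in for classifier accuracy: reward three "informative"
# feature indices and penalize every selected feature slightly.
INFORMATIVE = {1, 5, 7}


def toy_fitness(mask):
    return sum(1.0 for d in INFORMATIVE if mask[d]) - 0.1 * sum(mask)


# n_features=8 matches the eight predictors of the Pima Indians dataset.
mask, score = binary_gwo(toy_fitness, n_features=8)
print(mask, score)
```

Replacing `toy_fitness` with a function that trains a decision tree on the masked columns and returns its held-out accuracy would turn this into a wrapper-style selector; the oversampling and missing-value handling described above would happen before that call.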






How to Cite

Al.Hak, L. A. (2022). Diabetes Prediction Using Binary Grey Wolf Optimization and Decision Tree. International Journal of Computing, 21(4), 489-494. https://doi.org/10.47839/ijc.21.4.2785