Predicting Life Style of Early Diabetes Mellitus using Machine Learning Technique


  • Salliah Shafi Bhat
  • Venkatesan Selvam
  • Gufran Ahmad Ansari



Machine Learning Techniques, Diabetes Mellitus, Feature Engineering, SVC, LR, KNN, RF


Machine Learning (ML), a branch of artificial intelligence, enables machines to learn without being explicitly programmed. Machine Learning Techniques (MLT) have been used to forecast a variety of chronic diseases in the healthcare sector. Improved clinical approaches are needed for early diabetes prediction in order to prevent complications and avoid delays in diagnosis. The prevalence of diabetes is growing rapidly worldwide. In this paper, an MLT-based framework is proposed for the early prediction of Diabetes Mellitus (DM), using the PIDD dataset. Four MLTs are applied: Support Vector Classification (SVC), Logistic Regression (LR), K-Nearest Neighbor (KNN) and Random Forest (RF). The method begins with data analysis, followed by data pre-processing and feature selection. RF performed best with an accuracy of 92.85%, followed by SVC (91.5%), KNN (89.6%) and LR (83.11%). K-fold cross-validation is used to verify the results. The contribution of lifestyle characteristics is calculated using a feature engineering process. A comprehensive comparative assessment of all the algorithms is then performed in terms of accuracy, precision, sensitivity (recall), F1 score and ROC-AUC. The proposed framework can be used in the medical field for early diabetes prediction, and it can also be applied to other datasets that share attributes with diabetes data.
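The evaluation procedure named in the abstract (K-fold cross-validation scored by accuracy, precision, recall and F1) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, a hand-written KNN classifier, and synthetic two-cluster data in place of the PIDD dataset. All function names (`make_toy_data`, `knn_predict`, `k_fold_indices`, `binary_metrics`, `cross_validate`) and all parameter values are assumptions chosen for illustration.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=40, seed=42):
    """Synthetic stand-in for a diabetes dataset (NOT PIDD): two 2-D Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall (sensitivity) and F1 for positive class 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def cross_validate(X, y, k_folds=5, k_neighbors=3):
    """Hold out each fold once, train on the rest, and score the pooled predictions."""
    y_true, y_pred = [], []
    for held_out in k_fold_indices(len(X), k_folds):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        for i in held_out:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i], k_neighbors))
    return binary_metrics(y_true, y_pred)


X, y = make_toy_data()
metrics = cross_validate(X, y)
print(metrics)
```

Because every sample is held out exactly once, the pooled metrics describe performance on data the classifier did not see during training, which is the purpose of the cross-validation step. The same loop could be repeated with SVC, LR or RF models from a machine learning library to produce a comparison of the kind reported above.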






How to Cite

Shafi Bhat, S., Selvam, V., & Ansari, G. A. (2023). Predicting Life Style of Early Diabetes Mellitus using Machine Learning Technique. International Journal of Computing, 22(3), 345-351.