An Optimal Framework Based on the GentleBoost Algorithm and Bayesian Optimization for the Prediction of Breast Cancer Patients' Survivability

Authors

  • Ayman Alsabry
  • Malek Algabri
  • Amin Mohamed Ahsan
  • Mogeeb A. A. Mosleh
  • F. E. Hanash
  • Hamzah Ali Abdulrahman Qasem

DOI:

https://doi.org/10.47839/ijc.23.1.3439

Keywords:

Data Exploration, GentleBoost algorithm, Hyperparameters Tuning, Machine Learning, SEER breast cancer dataset

Abstract

Breast cancer is a leading cause of cancer-related mortality among women worldwide, and early detection and personalized treatment are critical for improving patient outcomes. In this study, we propose an optimal framework for predicting breast cancer patient survivability using the GentleBoost algorithm and Bayesian optimization. The framework combines GentleBoost, a boosting algorithm for classification, with Bayesian optimization, a technique for hyperparameter tuning. We evaluated the framework on the publicly available breast cancer dataset provided by the Surveillance, Epidemiology, and End Results (SEER) program and compared its performance with several popular single algorithms, including support vector machine (SVM), artificial neural network (ANN), and k-nearest neighbors (KNN). The experimental results demonstrate that the proposed framework outperforms these methods in terms of accuracy (mean = 95.16%, best = 95.35%, worst = 95.1%, SD = 0.008). The precision, recall, and F1-score of the best experiment were 92.3%, 98.2%, and 95.2%, respectively, obtained with the following hyperparameters: number of learners = 246, learning rate = 0.0011, and maximum number of splits = 1240. The proposed framework has the potential to improve breast cancer patient survival predictions and personalized treatment plans, leading to improved patient outcomes and reduced healthcare costs.
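
To make the boosting step named in the abstract concrete, the following is a minimal sketch of GentleBoost (Gentle AdaBoost) in pure Python, using one-split regression stumps as weak learners. It is not the authors' implementation: the function names, the toy data, and the way the learning rate is applied as a shrinkage factor on each stump are assumptions for illustration. The deeper trees (up to 1240 splits) and the Bayesian optimization loop described in the abstract are not shown.

```python
import math


def fit_stump(X, y, w):
    """Fit a regression stump by weighted least squares.

    Returns (feature, threshold, left_value, right_value), where each value
    is the weighted mean of the +1/-1 labels on that side of the split.
    Assumes at least one feature takes two or more distinct values.
    """
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0  # midpoint between neighbouring values
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            a = sum(w[i] * y[i] for i in left) / sum(w[i] for i in left)
            b = sum(w[i] * y[i] for i in right) / sum(w[i] for i in right)
            err = (sum(w[i] * (y[i] - a) ** 2 for i in left)
                   + sum(w[i] * (y[i] - b) ** 2 for i in right))
            if best is None or err < best[0]:
                best = (err, j, t, a, b)
    return best[1:]


def stump_output(stump, row):
    """Real-valued output of one (already shrunk) stump for one sample."""
    j, t, a, b = stump
    return a if row[j] <= t else b


def gentleboost_fit(X, y, n_learners=10, learning_rate=1.0):
    """Train an additive model F(x) = sum of stumps; labels must be +1/-1."""
    n = len(X)
    w = [1.0 / n] * n  # start from uniform sample weights
    ensemble = []
    for _ in range(n_learners):
        j, t, a, b = fit_stump(X, y, w)
        # Assumption: the learning rate shrinks each stump's contribution.
        stump = (j, t, learning_rate * a, learning_rate * b)
        ensemble.append(stump)
        # Up-weight samples the new stump gets wrong, then renormalize.
        w = [wi * math.exp(-yi * stump_output(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def gentleboost_predict(ensemble, row):
    """Classify by the sign of the summed stump outputs."""
    score = sum(stump_output(s, row) for s in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): the positive class sits between two negatives,
# so no single stump can separate it, but a sum of stumps can.
X_toy = [[1.0], [2.0], [3.0]]
y_toy = [-1, 1, -1]
ensemble = gentleboost_fit(X_toy, y_toy, n_learners=10)
preds = [gentleboost_predict(ensemble, row) for row in X_toy]
print(preds)  # [-1, 1, -1]
```

In this sketch the three quantities the abstract reports as tuned hyperparameters map onto `n_learners`, `learning_rate`, and the depth of the weak learner, which is fixed here at a single split.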

Published

2024-04-01

How to Cite

Alsabry, A., Algabri, M., Ahsan, A. M., Mosleh, M. A. A., Hanash, F. E., & Qasem, H. A. A. (2024). An Optimal Framework Based on the GentleBoost Algorithm and Bayesian Optimization for the Prediction of Breast Cancer Patients’ Survivability. International Journal of Computing, 23(1), 85-93. https://doi.org/10.47839/ijc.23.1.3439

Section

Articles