Diabetes Mellitus Detection Using the XGBoost and LightGBM Ensemble Technique

Authors

  • Naufal Adhi Pratama, Universitas Dian Nuswantoro
  • Danang Wahyu Utomo, Universitas Dian Nuswantoro

DOI:

https://doi.org/10.14421/jiska.4908

Keywords:

Diabetes Mellitus, Machine Learning, XGBoost, LightGBM, Early Detection

Abstract

Diabetes mellitus is a metabolic disease characterized by elevated blood glucose levels caused by impaired insulin secretion, impaired insulin action, or both. It has a major impact on public health and contributes to high morbidity and mortality in many countries, so prevention and early detection are essential to reducing its adverse effects. This study applies and evaluates machine learning algorithms for detecting diabetes mellitus, focusing on XGBoost and LightGBM. The dataset includes features related to diabetes risk factors, such as age, gender, body mass index (BMI), hypertension, smoking history, HbA1c level, and blood glucose level. During preprocessing, the data were cleaned and then balanced using the SMOTE-Tomek technique. The models were then built and evaluated with K-Fold cross-validation to measure their accuracy and stability. XGBoost achieved 97.31% accuracy and LightGBM achieved 97.26%. Blending the two models yielded 97.51% accuracy, 0.20 percentage points above the better single model, indicating that combining models can improve predictive performance. These results suggest that machine learning algorithms, particularly XGBoost and LightGBM, can detect diabetes mellitus accurately and efficiently, and may contribute to the development of decision support systems for more effective early diagnosis of diabetes.
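The abstract does not specify how the two models were blended. As a minimal sketch, assuming the blend is a weighted average of each model's predicted probability for the positive class, the idea can be illustrated with the standard library alone. The function names, the 0.5 weight, the 0.5 threshold, and the toy probabilities below are all hypothetical and are not taken from the study.

```python
def blend_probabilities(p_a, p_b, weight_a=0.5):
    """Weighted average of two models' positive-class probabilities.

    Assumption: this simple averaging stands in for the study's
    unspecified blending technique.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both models must score the same samples")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]


def accuracy(y_true, probabilities, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(1 for pred, y in zip(predictions, y_true) if pred == y)
    return correct / len(y_true)


# Hypothetical toy data: six patients, label 1 = diabetic.
y_true = [1, 0, 1, 0, 1, 0]
# Made-up scores standing in for XGBoost and LightGBM outputs.
p_xgb = [0.9, 0.6, 0.8, 0.2, 0.8, 0.3]    # misclassifies the second patient
p_lgbm = [0.7, 0.1, 0.45, 0.1, 0.6, 0.4]  # misclassifies the third patient

p_blend = blend_probabilities(p_xgb, p_lgbm)

acc_xgb = accuracy(y_true, p_xgb)
acc_lgbm = accuracy(y_true, p_lgbm)
acc_blend = accuracy(y_true, p_blend)
```

In this toy case each model errs on a different patient, so averaging pulls both borderline scores to the correct side of the threshold. Real gains depend on how often the two models' errors differ, and the blend is not guaranteed to outperform the better single model.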

Published

2025-11-13

How to Cite

Pratama, N. A., & Utomo, D. W. (2025). Deteksi Diabetes Mellitus dengan Menggunakan Teknik Ensemble XGBoost dan LightGBM. JISKA (Jurnal Informatika Sunan Kalijaga). https://doi.org/10.14421/jiska.4908

Section

Articles