Random Forest Algorithm and Synthetic Minority Oversampling Technique (SMOTE) for Diabetes Detection

Authors

  • Nurussakinah Nurussakinah, UIN Maulana Malik Ibrahim Malang
  • Muhammad Faisal, UIN Maulana Malik Ibrahim Malang
  • Irwan Budi Santoso, UIN Maulana Malik Ibrahim Malang

DOI:

https://doi.org/10.14421/jiska.2025.10.2.221-234

Keywords:

Detection, Diabetes, Random Forest, Synthetic Minority Oversampling Technique, Ensemble

Abstract

Diabetes is a major global health challenge, and Indonesia ranks fifth in the world among countries with the highest rates of diabetes. This study applies the Random Forest algorithm to diabetes detection, chosen because it can provide accurate and efficient results in the early diagnosis of diabetic patients. The data are secondary data, specifically the “Diabetes Dataset,” which contains 952 records with 17 features. Testing covers three data-split scenarios: scenario 1 (90:10 ratio), scenario 2 (70:30 ratio), and scenario 3 (50:50 ratio). Each scenario is run both with and without the Synthetic Minority Oversampling Technique (SMOTE). The best performance is obtained in scenario 1 with SMOTE: 97% accuracy, 100% precision, 94% recall, and an F1-score of 97%.
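To make the oversampling step and the reported metrics concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It shows the core idea of SMOTE (a synthetic minority sample is an interpolation between a minority sample and one of its nearest minority-class neighbours) and checks that the reported F1-score follows from the reported precision and recall. The function names `smote` and `f1_score`, the toy points, and the choice of `k=2` are all assumptions made for illustration; the abstract does not state the SMOTE settings used, and applying the oversampling to the training portion only is likewise an assumption.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` within the minority class, excluding itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy training split (hypothetical values, not the paper's data).
majority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.4, 0.4), (0.2, 0.5)]
minority = [(2.0, 2.0), (2.2, 2.1), (2.1, 2.4)]

# Oversample the minority class until both classes are the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points

# F1 implied by the precision (100%) and recall (94%) reported in the abstract.
reported_f1 = f1_score(precision=1.00, recall=0.94)
print(len(balanced_minority), round(reported_f1, 2))  # prints "6 0.97"
```

Because each synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies. The F1 value of about 0.97 agrees with the figure reported for scenario 1 with SMOTE.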

Published

2025-05-31

How to Cite

Nurussakinah, N., Faisal, M., & Santoso, I. B. (2025). Algoritma Random Forest dan Synthetic Minority Oversampling Technique (SMOTE) untuk Deteksi Diabetes. JISKA (Jurnal Informatika Sunan Kalijaga), 10(2), 221–234. https://doi.org/10.14421/jiska.2025.10.2.221-234

Section

Articles