Random Forest Algorithm and Synthetic Minority Oversampling Technique (SMOTE) for Diabetes Detection
DOI:
https://doi.org/10.14421/jiska.2025.10.2.221-234
Keywords:
Detection, Diabetes, Random Forest, Synthetic Minority Oversampling Technique, Ensemble
Abstract
Diabetes is one of the challenges in global health, and Indonesia ranks fifth in the world among countries with the highest rates of diabetes. This study applies the Random Forest algorithm to detect diabetes, with the aim of providing accurate and efficient results for the early diagnosis of diabetic patients. The data are secondary data, specifically the “Diabetes Dataset,” which consists of 952 records with 17 features. Three test scenarios split the data at different ratios: scenario 1 (90:10), scenario 2 (70:30), and scenario 3 (50:50). Each scenario compares results with and without the Synthetic Minority Oversampling Technique (SMOTE). The best performance is obtained in scenario 1 with SMOTE: 97% accuracy, 100% precision, 94% recall, and an F1-score of 97%.
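As a rough illustration of the two ideas in the abstract, the Python sketch below shows the core step of SMOTE (creating a synthetic minority sample by interpolating between a minority sample and one of its nearest minority neighbours) and checks that the reported precision and recall are consistent with the reported F1-score. It is a minimal sketch using only the standard library, not the authors' pipeline: the function names `smote_like` and `f1_score`, the toy `minority` points, and the choice of `k` are all assumptions made for illustration.

```python
import math
import random


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; it is not the study's implementation.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        # Pick one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy minority-class samples (assumed values, not taken from the Diabetes Dataset).
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote_like(minority, n_new=6, k=2)

# Precision and recall as reported for scenario 1 with SMOTE.
reported_f1 = f1_score(1.00, 0.94)
print(len(new_points), round(reported_f1, 2))  # prints "6 0.97"
```

With 100% precision and 94% recall, the harmonic mean is 1.88 / 1.94 ≈ 0.969, which rounds to the 97% F1-score stated in the abstract.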
License
Copyright (c) 2025 Nurussakinah Nurussakinah, Muhammad Faisal, Irwan Budi Santoso

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Authors who publish with this journal agree to the following terms as stated in http://creativecommons.org/licenses/by-nc/4.0
a. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.




