Optimizing K-Means Algorithm Using the Purity Method for Clustering Oil Palm Producing Regions in North Aceh

Authors

  • Novia Hasdyna, Universitas Islam Kebangsaan Indonesia
  • Rozzi Kesuma Dinata, Universitas Malikussaleh
  • Balqis Yafis, National Yang Ming Chiao Tung University

DOI:

https://doi.org/10.14421/jiska.2025.10.1.1-15

Keywords:

K-Means Algorithm, Purity Method, Data Clustering, Oil Palm Production, North Aceh, Davies-Bouldin Index (DBI)

Abstract

The K-Means algorithm is a fundamental tool in machine learning, widely used for data clustering. This research aims to improve the performance of K-Means by integrating the Purity method, focusing on clustering the oil palm producing regions of North Aceh. Oil palm cultivation is a vital agricultural sector in North Aceh, contributing significantly to the local economy and employment. The study compares two clustering techniques: conventional K-Means and an optimized version, Purity+K-Means. Integrating the Purity method makes K-Means more efficient by reducing the number of iterations required for convergence. The data, sourced from the Department of Agriculture and Food of North Aceh Regency, cover oil palm production in 2023. The findings indicate that Purity+K-Means both reduces the iteration count and improves cluster quality: the average Davies-Bouldin Index (DBI), for which lower values indicate more compact and better-separated clusters, falls from 0.45 for standard K-Means to 0.30, and the number of K-Means iterations drops from 15 to 3. These results indicate an improvement in both clustering quality and efficiency.
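
The two quantities reported in the abstract, the iteration count at convergence and the Davies-Bouldin Index, can be measured as in the following minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe the Purity procedure, so the "informed" starting centroids below are simply hand-picked as a stand-in, and the six points are toy values rather than the North Aceh production data. The function names `kmeans` and `davies_bouldin` are hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's K-Means. Returns (labels, centroids, iterations).

    The iteration count includes the final pass that confirms the
    assignments no longer change.
    """
    centroids = [tuple(c) for c in centroids]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            return labels, centroids, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids, max_iter


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin Index: lower means compact, well-separated clusters.

    Assumes at least two non-empty clusters with distinct centroids.
    """
    ks = sorted(set(labels))
    # Scatter of a cluster: mean distance of its members to its centroid.
    scatter = {}
    for k in ks:
        members = [p for p, lab in zip(points, labels) if lab == k]
        scatter[k] = sum(math.dist(p, centroids[k]) for p in members) / len(members)
    # For each cluster, take its worst (largest) similarity ratio, then average.
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in ks
            if j != i
        )
        for i in ks
    ]
    return sum(worst) / len(ks)


# Toy data (NOT the paper's dataset): two obvious groups on a line.
points = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (11.0, 0.0), (12.0, 0.0), (13.0, 0.0)]

# Naive start: both initial centroids fall inside the same group.
naive_labels, naive_centroids, naive_iters = kmeans(points, [(1.0, 0.0), (2.0, 0.0)])

# Informed start: hand-picked stand-in for a better initialization
# (an assumption; this is not the Purity method itself).
good_labels, good_centroids, good_iters = kmeans(points, [(2.0, 0.0), (12.0, 0.0)])

print(naive_iters, good_iters)  # 3 2
print(round(davies_bouldin(points, good_labels, good_centroids), 3))  # 0.133
```

On this toy input both starts reach the same final clusters, so only the iteration count differs; on real data a better initialization may also change which clusters K-Means settles on, which is what a change in DBI would reflect.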

Published

2025-01-31

How to Cite

Hasdyna, N., Kesuma Dinata, R., & Yafis, B. (2025). Optimizing K-Means Algorithm Using the Purity Method for Clustering Oil Palm Producing Regions in North Aceh. JISKA (Jurnal Informatika Sunan Kalijaga), 10(1), 1–15. https://doi.org/10.14421/jiska.2025.10.1.1-15