Optimizing K-Means Algorithm Using the Purity Method for Clustering Oil Palm Producing Regions in North Aceh

Authors

  • Novia Hasdyna, Universitas Islam Kebangsaan Indonesia
  • Rozzi Kesuma Dinata, Universitas Malikussaleh
  • Balqis Yafis, National Yang Ming Chiao Tung University

DOI:

https://doi.org/10.14421/jiska.2025.10.1.1-15

Keywords:

K-Means Algorithm, Purity Method, Data Clustering, Oil Palm Production, North Aceh, Davies-Bouldin Index (DBI)

Abstract

The K-Means algorithm is a fundamental machine learning tool that is widely used for data clustering. This research aims to improve the performance of K-Means by integrating the Purity method, and applies it to clustering the oil palm producing regions of North Aceh. Oil palm cultivation is a vital agricultural sector in North Aceh, contributing significantly to the local economy and employment. The study compares two clustering techniques: the conventional K-Means algorithm and an optimized version, Purity K-Means. Integrating the Purity method makes K-Means more efficient by reducing the number of iterations required for convergence. The data used for the clustering analysis come from the Department of Agriculture and Food of North Aceh Regency and cover oil palm production in 2023. The findings indicate that Purity K-Means both reduces the iteration count and improves cluster quality. The average Davies-Bouldin Index (DBI), for which lower values indicate more compact and better-separated clusters, is 0.45 for standard K-Means and 0.30 for Purity K-Means. The number of iterations needed to converge fell from 15 with standard K-Means to 3 with Purity K-Means. These results show an improvement in both clustering quality and computational efficiency.
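
The two quantities reported in the abstract, the number of iterations to convergence and the Davies-Bouldin Index, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic 2-D points rather than the North Aceh production data, the function names are hypothetical, and the abstract does not describe how the Purity method selects centroids, so a hand-picked "one starting centroid per group" initialization stands in for it. The DBI is computed in its usual form, the mean over clusters of the largest ratio of summed within-cluster scatter to centroid separation.

```python
import math
import random


def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd's K-Means. Returns (labels, centroids, iterations).

    An iteration is one assignment pass; the pass that finds no label
    changes is counted, so the smallest possible count is 2.
    """
    centroids = [list(c) for c in init_centroids]
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assignment step: index of the nearest centroid (Euclidean distance).
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            return labels, centroids, iteration
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids, max_iter


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin Index over the non-empty clusters (lower is better)."""
    ks = sorted(set(labels))
    scatter = {}
    for k in ks:
        members = [p for p, lab in zip(points, labels) if lab == k]
        # Mean distance of a cluster's members to its centroid.
        scatter[k] = sum(math.dist(p, centroids[k]) for p in members) / len(members)
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in ks
            if j != i
        )
        for i in ks
    ]
    return sum(worst_ratios) / len(ks)


# Synthetic data (assumption): three well-separated groups of 20 points each.
rng = random.Random(0)
group_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in group_centers
    for _ in range(20)
]

# Poor start: all three initial centroids come from the first group.
poor_init = points[:3]
# Informed start (stand-in, not the Purity method): one point per group.
good_init = [points[0], points[20], points[40]]

labels_poor, cent_poor, iters_poor = kmeans(points, poor_init)
labels_good, cent_good, iters_good = kmeans(points, good_init)

print("poor init:", iters_poor, "iterations, DBI =",
      round(davies_bouldin(points, labels_poor, cent_poor), 3))
print("good init:", iters_good, "iterations, DBI =",
      round(davies_bouldin(points, labels_good, cent_good), 3))
```

On this toy data the informed start already assigns every point to its own group on the first pass, so the second pass only confirms the assignment. The exact figures for the poor start depend on the random draw and are not meant to reproduce the values reported in the abstract.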

Published

2025-01-31

How to Cite

Hasdyna, N., Dinata, R. K., & Yafis, B. (2025). Optimizing K-Means Algorithm Using the Purity Method for Clustering Oil Palm Producing Regions in North Aceh. JISKA (Jurnal Informatika Sunan Kalijaga), 10(1), 1–15. https://doi.org/10.14421/jiska.2025.10.1.1-15