Comparative Study of K-Means Clustering Algorithm and K-Medoids Clustering in Student Data Clustering
DOI: https://doi.org/10.14421/jiska.2022.7.2.91-99
Keywords: Data Mining, Data Pre-Processing, Validation Test, Davies Bouldin Index, Optimal
Abstract
Universities, as educational institutions, hold very large amounts of academic data that may not be used to their full potential. Analyzing this data can produce information that maps the distribution of students. This study processes student academic data with two data mining clustering techniques, K-Means and K-Medoids, and compares them to determine which algorithm is more optimal according to a cluster validation test using the Davies Bouldin Index (DBI), where a lower value indicates more compact and better-separated clusters. The data are academic records of UIN Sunan Kalijaga students from the 2013-2015 batches. For K-Means, the best number of clusters is 5, with a DBI value of 0.781. For K-Medoids, the best number of clusters is 3, with a DBI value of 0.929. Based on the DBI validation test, K-Means is more optimal than K-Medoids. The cluster of students with the highest average GPA, 3.325, contains 401 students.
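The comparison described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the toy two-feature data, and the range of k are all hypothetical, the K-Medoids variant shown is a simple alternating scheme rather than necessarily the one used in the study, and the DBI is computed from cluster centroids for both algorithms, which is an assumption.

```python
import math
import random

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def medoid(points):
    """Member with the smallest total distance to all other members."""
    return min(points, key=lambda c: sum(math.dist(c, p) for p in points))

def assign(data, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in data]

def fit(data, k, pick_center, iters=100, seed=0):
    """Alternate between assigning points and re-picking each center."""
    centers = random.Random(seed).sample(data, k)
    for _ in range(iters):
        labels = assign(data, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(pick_center(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return assign(data, centers), centers

def k_means(data, k, seed=0):
    return fit(data, k, mean_point, seed=seed)

def k_medoids(data, k, seed=0):
    return fit(data, k, medoid, seed=seed)

def davies_bouldin(data, labels):
    """Davies Bouldin Index; lower is better. Needs at least 2 clusters."""
    clusters = [[p for p, lab in zip(data, labels) if lab == j]
                for j in sorted(set(labels))]
    centroids = [mean_point(c) for c in clusters]
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = [sum(math.dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(clusters, centroids)]
    n = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                 for j in range(n) if j != i)
             for i in range(n)]
    return sum(worst) / n

def best_k(cluster_fn, data, k_values):
    """Return the k with the lowest DBI, plus all scores."""
    scores = {k: davies_bouldin(data, cluster_fn(data, k)[0]) for k in k_values}
    return min(scores, key=scores.get), scores

# Hypothetical toy data (two features per record), not the study's dataset.
data = [(0.0, 0.0), (0.2, 0.9), (1.1, 0.1), (0.9, 1.0),
        (10.0, 0.3), (10.2, 1.1), (11.0, 0.0), (10.8, 0.9),
        (0.1, 10.0), (0.3, 11.1), (1.0, 10.2), (1.2, 10.9)]

for name, fn in (("K-Means", k_means), ("K-Medoids", k_medoids)):
    k, scores = best_k(fn, data, range(2, 5))
    print(name, "best k =", k, {c: round(s, 3) for c, s in scores.items()})
```

Both algorithms start from a single random initialization here, so the selected k can depend on the seed; a more careful comparison would likely use several restarts per k before comparing DBI values.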
Copyright (c) 2022 Qomariyah, Maria Ulfah Siregar
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.