Application of Data Mining with the Linear Regression Method to Predict Exam Score Data Using RapidMiner

Authors

  • Muhammad Sholeh, Institut Sains & Teknologi AKPRIND
  • Erna Kumalasari Nurnawati, Institut Sains & Teknologi AKPRIND
  • Uning Lestari, Institut Sains & Teknologi AKPRIND

DOI:

https://doi.org/10.14421/jiska.2023.8.1.10-21

Keywords:

Model, Data Mining, Linear Regression, RapidMiner, Datasheet

Abstract

Prediction is one of the tasks in data mining, and linear regression is one of the models that can be used for it. In this study, a linear regression model was built from a dataset containing attributes that affect students' final exam scores, so that the model can be used to predict student test scores. The study uses a public dataset, student_performance.csv, which consists of 395 records and 33 attributes. Attributes were selected according to their influence on the label, based on the weights obtained from the correlation matrix. On that basis, seven attributes are used as predictors and one attribute serves as the label. The research method follows CRISP-DM, which consists of business understanding, data understanding, data preparation, modeling, evaluation, and deployment. The data mining process was carried out in RapidMiner. The resulting linear regression model is y=0.729-(0.024×Medu)-(0.020×Fedu)+(0.053×failures)-(0.077×goout)-(0.012×absences)+(0.126×G1)+(0.862×G2). Evaluating the model's performance gave an RMSE of 0.675. Based on these results, the model can be recommended for predicting student test scores.
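
As an illustration, the reported model can be written out as a short Python sketch. The intercept and coefficients are copied from the equation in the abstract. The function names `predict_score` and `rmse`, the dictionary layout, and the example record are assumptions made here for illustration only; the study itself was carried out in RapidMiner, and this is not the authors' implementation.

```python
import math

# Intercept and coefficients copied from the model reported in the abstract.
INTERCEPT = 0.729
COEFFICIENTS = {
    "Medu": -0.024,
    "Fedu": -0.020,
    "failures": 0.053,
    "goout": -0.077,
    "absences": -0.012,
    "G1": 0.126,
    "G2": 0.862,
}


def predict_score(student):
    """Apply the linear model to one record given as a dict of attribute values."""
    return INTERCEPT + sum(
        coef * student[name] for name, coef in COEFFICIENTS.items()
    )


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences of scores."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical record, not taken from student_performance.csv.
example = {
    "Medu": 4, "Fedu": 3, "failures": 0, "goout": 2,
    "absences": 4, "G1": 12, "G2": 13,
}
print(round(predict_score(example), 3))
```

With these coefficients the prediction is dominated by G2 (weight 0.862), so a record's predicted score stays close to its G2 value. The `rmse` helper shows how an error figure such as the reported 0.675 could be computed once true and predicted scores are available for a test set.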

Published

2023-01-30

How to Cite

Sholeh, M., Nurnawati, E. K., & Lestari, U. (2023). Penerapan Data Mining dengan Metode Regresi Linear untuk Memprediksi Data Nilai Hasil Ujian Menggunakan RapidMiner. JISKA (Jurnal Informatika Sunan Kalijaga), 8(1), 10–21. https://doi.org/10.14421/jiska.2023.8.1.10-21