Implementation of Cosine Similarity in an Automatic Classifier for Comments
DOI: https://doi.org/10.14421/jiska.2018.32-05
Abstract
Large volumes of text must be classified before the information they contain can be extracted. Student comments containing suggestions and criticisms about lecturers and the lecture process, submitted through the learning evaluation system, are not well classified, which makes the assessment process difficult. A classification model is therefore needed that can automatically assign comments to categories. The method used is Cosine Similarity, which measures the similarity between two objects represented as vectors. The data used in this study comprise 1,630 comments across several categories. The model was tested using k-fold cross-validation with k = 10. The results show that the classification model achieved an accuracy of 80.87%.
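To illustrate the idea of comparing comments as vectors, the following is a minimal sketch rather than the study's implementation. The abstract does not specify the term weighting, preprocessing, or decision rule, so the raw term counts, whitespace tokenization, and "label of the most similar training comment" rule used here are assumptions, and the function names (`vectorize`, `cosine_similarity`, `classify`) and the toy comments are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a comment into a sparse term-count vector (assumed weighting)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(comment, labeled_comments):
    """Return the label of the most similar labeled comment (assumed decision rule)."""
    query = vectorize(comment)
    best_label, _ = max(
        ((label, cosine_similarity(query, vectorize(text)))
         for text, label in labeled_comments),
        key=lambda pair: pair[1],
    )
    return best_label


# Hypothetical toy data, not taken from the study.
TRAINING = [
    ("the lecturer explains the material clearly", "lecturer"),
    ("the lecturer often arrives late", "lecturer"),
    ("the classroom projector is broken", "facilities"),
    ("the classroom air conditioning is too cold", "facilities"),
]

print(classify("the projector in the classroom does not work", TRAINING))
```

Because cosine similarity depends only on the angle between vectors, a long comment and a short comment that use the same words in the same proportions score as identical, which is one reason the measure is commonly used for text of varying length.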
License
Authors who publish with this journal agree to the following terms as stated in http://creativecommons.org/licenses/by-nc/4.0
a. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.