Comparative Analysis of Text Mining Classification Algorithms for English and Indonesian Qur’an Translation

Rahmat Hidayat; Sekar Minati

doi:10.14421/ijid.2019.08108

Vol. 8 No. 1 (2019), Articles

Published 2019-06-22

Rahmat Hidayat

Sunan Kalijaga State Islamic University

Sekar Minati

Sunan Kalijaga State Islamic University

Keywords

SVM
Naïve Bayes
kNN
J48
text classification
Qur’an

How to Cite

Hidayat, R., & Minati, S. (2019). Comparative Analysis of Text Mining Classification Algorithms for English and Indonesian Qur’an Translation. IJID (International Journal on Informatics for Development), 8(1), 47-51. https://doi.org/10.14421/ijid.2019.08108

Abstract

The Qur'an, As-Sunnah, and classical Islamic books serve Muslims as sources of knowledge, wisdom, and law. In daily life, however, many Muslims still do not understand the meaning of the sentences in the Qur'an even though they read it every day. This poses a challenge for academics in science and engineering, especially informatics: to explore and represent knowledge through intelligent computing systems that can answer questions based on knowledge from the Qur'an. This research creates an enabling computational environment for text mining the Qur'an, with the purpose of helping people understand each verse. The classification experiment uses the Support Vector Machine (SVM), Naïve Bayes, k-Nearest Neighbor (kNN), and J48 Decision Tree classifier algorithms. The dataset consists of the verses of Al-Baqarah translated into English and Indonesian, labeled with the three most fundamental aspects of Islam: 'Iman' (faith), 'Ibadah' (worship), and 'Akhlaq' (virtues). The Indonesian translation was pre-processed with the Sastrawi package in Python and then passed through the StringToWordVector filter in WEKA, using the TF-IDF method, before the algorithms were applied. The experiments measure accuracy and F-measure using a percentage split in which 66% of the data is used for training and the rest for testing. According to the classification results, the SVM classifier has the best overall accuracy for the Indonesian translation, at 81.443%, and the Naïve Bayes classifier has the best accuracy for the English translation, at 78.35%.
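The evaluation setup described in the abstract (TF-IDF weighting, a 66% percentage split, and accuracy and F-measure scoring) can be illustrated with a minimal Python sketch. This is not the authors' code, which relied on Sastrawi and WEKA. The function names, the textbook TF-IDF formula `tf * log(N / df)`, and the order-preserving split are all assumptions made for illustration, and WEKA's own weighting and shuffling may differ.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the textbook weight tf * log(N / df); this is an assumed
    formula, not necessarily the one WEKA applies.
    """
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def percentage_split(items, train_fraction=0.66):
    """Split items into (train, test), keeping the original order.

    Order preservation is an assumption of this sketch.
    """
    cut = round(len(items) * train_fraction)
    return items[:cut], items[cut:]


def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def f_measure(gold, predicted, label):
    """Harmonic mean of precision and recall for one class label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy tokens for illustration only; not verses from the dataset.
toy_docs = [["faith", "belief"], ["prayer", "fasting"], ["faith", "prayer"]]
toy_vectors = tfidf(toy_docs)
train, test = percentage_split(list(range(100)))
```

In such a setup, the weighted vectors for the training portion would be passed to a classifier such as SVM or Naïve Bayes, and `accuracy` and `f_measure` would be computed on its predictions for the held-out portion.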

IJID (International Journal on Informatics for Development) is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License
