Integrating Retrieval-Augmented Generation with Large Language Model Mistral 7b for Indonesian Medical Herb

Diash Firdaus; Idi Sumardi; Yuni Kulsum

doi:10.14421/jiska.2024.9.3.230-243

Authors

Diash Firdaus, Institut Teknologi Nasional
Idi Sumardi, STMIK Jawa Barat
Yuni Kulsum, UIN Sunan Gunung Djati

Keywords:

LLM, Generative AI, LLAMA2, Retrieval-Augmented Generation, Deep Learning

Abstract

Large Language Models (LLMs) are artificial intelligence systems that use deep learning, particularly transformer architectures, to process and generate text. One such model, Mistral 7B, has 7 billion parameters and is optimized for high performance and efficiency in natural language processing tasks. It outperforms similar models, such as LLaMA 2 7B and LLaMA 1, across various benchmarks, especially in reasoning, mathematics, and coding. LLMs have also shown significant progress in answering medical queries. This research draws on Indonesia’s rich biodiversity, which includes approximately 9,600 medicinal plant species out of the 30,000 known species. The study is motivated by the observation that LLMs such as ChatGPT and Gemini often rely on internet data of uncertain validity and frequently give generic answers that do not mention specific herbal plants found in Indonesia. To address this, the dataset for pre-training the model is derived from academic journals on Indonesian medicinal herbal plants. The research process involves collecting these journals, preprocessing them with LangChain, embedding the text with sentence transformers, and using Faiss CPU for efficient similarity search. Retrieval-Augmented Generation (RAG) is then applied to Mistral 7B, allowing it to give accurate, dataset-driven responses to user queries. The model's performance is assessed through human evaluation and through automatic metrics: ROUGE recall, precision, and F1, together with METEOR. The results show that the RAG Mistral 7B model achieved a METEOR score of 0.22%, outperforming the LLaMA 2 7B model, which scored 0.14%.
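
The retrieve-then-prompt step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a toy word-count embedding stands in for the sentence-transformer vectors, a brute-force cosine ranking stands in for the Faiss index, and the final call to Mistral 7B is left out. The passages, the plant names, the prompt wording, and the function names (embed, cosine, retrieve, build_prompt) are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

# Made-up placeholder passages (NOT from the paper's dataset); they stand in
# for text chunks extracted from the collected journals.
PASSAGES = [
    "Plant alpha leaves: boiled and taken for fever.",
    "Plant beta roots: ground and applied to skin rashes.",
    "Plant gamma bark: brewed as a tea to ease cough.",
]


def embed(text):
    """Toy embedding: lowercase word counts.

    Assumption: this replaces the dense sentence-transformer vector
    used in the described pipeline.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query.

    Assumption: an exhaustive scan replaces the Faiss similarity search.
    """
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, contexts):
    """Place the retrieved passages in front of the question (hypothetical template)."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {query}\n"
        "Answer:"
    )


if __name__ == "__main__":
    question = "Which plant is used for fever?"
    top = retrieve(question, PASSAGES, k=1)
    # The resulting prompt is what would be passed to the language model.
    print(build_prompt(question, top))
```

In a real pipeline of the kind the abstract describes, embed would be replaced by a sentence-transformer encoder, retrieve by a query against a Faiss index, and the assembled prompt would be sent to the model so that its answer is grounded in the retrieved journal text.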
