Vector space model, term frequency-inverse document frequency with linear search, and object-relational mapping Django on hadith data search

Ichsan Taufik, Agra Agra, Yana Aditia Gerhana

Abstract


For Muslims, the Hadith ranks as the secondary legal authority following the Quran. This research leverages hadith data to streamline the search process within the nine imams’ compendium using the vector space model (VSM) approach. The primary objective of this research is to enhance the efficiency and effectiveness of the search process within Hadith collections by implementing pre-filtering techniques. This study aims to demonstrate the potential of linear search and Django object-relational mapping (ORM) filters in reducing search times and improving retrieval performance, thereby facilitating quicker and more accurate access to relevant Hadiths. Prior studies have indicated that VSM is efficient for large data sets because it assigns weights to every term across all documents, regardless of whether they include the search keywords. Consequently, the more documents there are, the more protracted the weighting phase becomes. To address this, the current research pre-filters documents prior to weighting, utilizing linear search and Django ORM as filters. Testing on 62,169 hadiths with 20 keywords revealed that the average VSM search duration was 51 seconds. However, with the implementation of linear and Django ORM filters, the times were reduced to 7.93 and 8.41 seconds, respectively. The recall@10 rates were 79% and 78.5%, with MAP scores of 0.819 and 0.814, accordingly.

Keywords


Hadith 9 imams; Linear search; Orm django; Search; Vector space model

Full Text:

PDF


DOI: https://doi.org/10.11591/csit.v5i3.p306-314

Refbacks

  • There are currently no refbacks.


Computer Science and Information Technologies
p-ISSN: 2722-323X, e-ISSN: 2722-3221
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Universitas Ahmad Dahlan (UAD).

CSIT Visitor Stats

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.