First Token Algorithm for Searching Compound Terms Using Thesaurus Database
Abstract
Problem statement: Searching text is one of the most important operations carried out by search engines, whether on the web or in desktop applications. Search algorithms are sometimes required to find a single word in a text, and at other times to find a multi-word term (pattern matching). Searching for a term in a thesaurus database can be carried out with many search algorithms, such as the brute-force algorithm. Approach: We addressed several issues in developing an algorithm that searches for terms in a thesaurus database. Two exact algorithms were discussed and compared: the brute-force algorithm, and a second algorithm proposed in this study to improve on it. Results: We proposed an efficient search algorithm and compared it with the brute-force technique. Computational results showed that our algorithm reduces both the number of queries and the total time required to complete the task. Conclusion: Our study yielded an optimum solution for larger instances of the studied problem with much less processing time than the brute-force algorithm. The modified algorithm deals with thesaurus database searching problems more efficiently.
DOI: https://doi.org/10.3844/jcssp.2012.61.67
Copyright: © 2012 Yousef Abuzir and Thabit Sabbah. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Brute-force
- pattern matching
- information retrieval
- compound terms searching
- First Token (FT)
- thesaurus database
- training thesaurus
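
The abstract contrasts a brute-force search with a First Token approach but does not spell out either procedure. The sketch below is only one plausible reading, not the authors' implementation: it assumes the thesaurus is a plain list of terms, that text is split on whitespace, and that "first token" means indexing terms by their first word so that only terms starting with the current token are checked. The function names and the `comparisons` counter (a stand-in for the number of queries the abstract mentions) are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def brute_force_search(text, terms):
    """Try every thesaurus term at every token position of the text."""
    tokens = tokenize(text)
    matches, comparisons = [], 0
    for i in range(len(tokens)):
        for term in terms:
            term_tokens = tokenize(term)
            if not term_tokens:
                continue  # ignore empty terms
            comparisons += 1
            if tokens[i:i + len(term_tokens)] == term_tokens:
                matches.append((i, term))
    return matches, comparisons


def build_first_token_index(terms):
    """Group terms by their first token (hypothetical index structure)."""
    index = defaultdict(list)
    for term in terms:
        term_tokens = tokenize(term)
        if term_tokens:
            index[term_tokens[0]].append((term, term_tokens))
    return index


def first_token_search(text, index):
    """At each position, test only the terms whose first token matches."""
    tokens = tokenize(text)
    matches, comparisons = [], 0
    for i, token in enumerate(tokens):
        for term, term_tokens in index.get(token, ()):
            comparisons += 1
            if tokens[i:i + len(term_tokens)] == term_tokens:
                matches.append((i, term))
    return matches, comparisons


if __name__ == "__main__":
    thesaurus = ["information retrieval", "pattern matching",
                 "thesaurus database", "information"]
    text = "pattern matching supports information retrieval in a thesaurus database"
    print(brute_force_search(text, thesaurus))
    print(first_token_search(text, build_first_token_index(thesaurus)))
```

Under these assumptions both functions return the same matches, while the indexed version performs a candidate comparison only when a text token actually begins some thesaurus term. That is the kind of saving the abstract reports, although the paper's own measurements concern database queries and elapsed time rather than this counter.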