Exploiting Surrounding Text for Retrieving Web Images
Abstract
Web documents contain useful textual information that can be exploited for describing images. Research has focused on representing images by their low-level content descriptions, such as color, shape and texture, while little research has been directed at exploiting such textual information. The aim of this research was to systematically exploit the textual content of HTML documents for automatically indexing and ranking images embedded in web documents. A heuristic approach for locating and assigning weights to the text surrounding web images, together with a modified tf.idf weighting scheme, was proposed. Precision-recall evaluation was conducted for ten queries, and promising results were achieved. The proposed approach showed slightly better precision than a popular search engine, with average relative precision of 0.63 and 0.55, respectively.
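As a rough illustration of the general idea, the sketch below indexes images by terms taken from the text around them, scaling each term's frequency by a per-region weight before applying an idf factor. It is not the authors' scheme: the region names (`alt`, `caption`, `title`, `body`), the weight values, the smoothed idf formula and all function names are assumptions made for this example, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical region weights; the abstract does not state the actual values.
REGION_WEIGHTS = {"alt": 3.0, "caption": 2.0, "title": 1.5, "body": 1.0}

def weighted_tf(regions):
    """regions: dict mapping a region name to the list of terms found there."""
    tf = Counter()
    for region, terms in regions.items():
        weight = REGION_WEIGHTS.get(region, 1.0)
        for term in terms:
            tf[term] += weight  # each occurrence counts by its region weight
    return tf

def build_index(images):
    """images: dict image_id -> regions. Returns image_id -> {term: weight}."""
    tfs = {img: weighted_tf(regions) for img, regions in images.items()}
    df = Counter()
    for tf in tfs.values():
        df.update(tf.keys())  # document frequency: one count per image
    n = len(images)
    # Smoothed idf (an assumption) so that no term gets a negative weight.
    return {
        img: {t: f * math.log((n + 1) / df[t]) for t, f in tf.items()}
        for img, tf in tfs.items()
    }

def rank(index, query_terms):
    """Return image ids with a positive score, best first."""
    scores = {
        img: sum(vec.get(t, 0.0) for t in query_terms)
        for img, vec in index.items()
    }
    hits = [img for img, s in scores.items() if s > 0]
    return sorted(hits, key=lambda img: -scores[img])

def precision(ranked, relevant):
    """Fraction of retrieved images that are relevant."""
    if not ranked:
        return 0.0
    return sum(1 for img in ranked if img in relevant) / len(ranked)
```

With such an index, a query term that appears in an image's `alt` text contributes more to that image's score than the same term appearing only in the general page body, which is the intuition behind weighting surrounding text by location.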
DOI: https://doi.org/10.3844/jcssp.2008.842.846
Copyright: © 2008 S. A. Noah, A. Azilawati, T. M.T. Sembok and T. W.T.S. Meriam. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Information retrieval
- Image retrieval
- Precision recall