Normalized Web Distance Based Web Query Classification
Abstract
Problem statement: The problem is to classify a given web query into a set of 67 target categories, ranked by their degree of similarity to the query. Approach: The feature set is the set of intermediate categories retrieved from a directory search engine for the query. The intermediate categories are mapped to the required target categories using direct mapping and Normalized Web Distance (NWD). The target categories are then ranked using three parameters of the intermediate categories: position, frequency, and a combination of frequency and position. Results: The results showed that the third parameter, the combination of frequency and position, gave better results than the other two, and that using a maximum of 40 search result pages ensures better results. Conclusion: With NWD as the similarity measure, precision and recall are found to increase by 10% over the previous methods.
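As an illustration of the mapping step, the following is a minimal Python sketch of NWD computed from page counts, using the standard definition NWD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))). The function names, the hit counts, the category labels, and the index size `n` are all assumptions made for this example; the abstract does not specify how the authors obtained counts or resolved ties, so this should not be read as their implementation.

```python
import math


def nwd(fx, fy, fxy, n):
    """Normalized Web Distance from page counts.

    fx, fy : number of pages containing each term on its own
    fxy    : number of pages containing both terms
    n      : assumed total number of indexed pages (must exceed fx and fy)
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # No occurrences or no co-occurrence: treat as maximally distant.
        return math.inf
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(log_fx, log_fy) - log_fxy) / (math.log(n) - min(log_fx, log_fy))


def closest_target(intermediate, targets, counts, pair_counts, n):
    """Return the target category with the smallest NWD to an intermediate one.

    counts and pair_counts are hypothetical lookup tables of hit counts;
    a missing pair is treated as zero co-occurrence.
    """
    return min(
        targets,
        key=lambda t: nwd(
            counts[intermediate],
            counts[t],
            pair_counts.get((intermediate, t), 0),
            n,
        ),
    )


# Made-up counts, purely for illustration.
counts = {"laptops": 1000, "Computers/Hardware": 2000, "Sports/Soccer": 3000}
pair_counts = {
    ("laptops", "Computers/Hardware"): 800,
    ("laptops", "Sports/Soccer"): 5,
}
targets = ["Computers/Hardware", "Sports/Soccer"]
best = closest_target("laptops", targets, counts, pair_counts, n=10**9)
print(best)
```

A smaller NWD indicates that two terms co-occur more often than their individual frequencies would suggest, which is why the sketch picks the target category with the minimum distance.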
DOI: https://doi.org/10.3844/jcssp.2012.804.808
Copyright: © 2012 S. Lovelyn Rose and K. R. Chandran. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Automatic web query classification
- Directory search
- Query log
- NWD