Research Article Open Access

A Comparative Analysis of the Entropy and Transition Point Approach in Representing Index Terms of Literary Text

Hayati Abd Rahman and Shahrul Azman Noah

Abstract

Problem statement: A concept hierarchy is a hierarchically organized collection of domain concepts. It is particularly useful in applications such as information retrieval, document browsing and document classification. Approach: One of the important tasks in constructing a concept hierarchy is identifying suitable terms while keeping the domain vocabulary at an appropriate size. Results: One way of achieving such a size is term reduction. The aim of this study is to examine how effectively term selection methods reduce the size of the vocabulary for literary text. The experiment compares the entropy method, the transition point method and a hybrid of the transition point and entropy methods with the Vector Space Model (VSM). Conclusion/Recommendations: Results indicate that the Transition Point method is more effective than the others at reducing the size of the vocabulary while preserving the important terms that exist in the literary documents.

Journal of Computer Science
Volume 7 No. 7, 2011, 1088-1093

DOI: https://doi.org/10.3844/jcssp.2011.1088.1093

Submitted On: 18 October 2010 Published On: 30 June 2011

How to Cite: Rahman, H. A. & Noah, S. A. (2011). A Comparative Analysis of the Entropy and Transition Point Approach in Representing Index Terms of Literary Text. Journal of Computer Science, 7(7), 1088-1093. https://doi.org/10.3844/jcssp.2011.1088.1093

Keywords

  • Information retrieval
  • Term reduction
  • Concept hierarchy
  • Dominating Set Problem (DSP)
  • Vector Space Model (VSM)
  • Transition Point (TP)