Genome Sequence Analysis: A Survey
Abstract
Problem statement: Sequence analysis problems are NP-hard and require carefully optimized solutions. Interesting problems include duplicate sequence detection, sequence matching by relevance, sequence analysis using approximate comparison, either in general or with tools such as MATLAB, and multi-lingual sequence analysis. The usefulness of these operations is highlighted and future expectations are described. Approach: This study describes the concepts, tools, methodologies, and algorithms used for sequence analysis. Sequences contain valuable information that must be mined for useful purposes, and modeling an optimal solution requires considerable care. Similarity and alignment cannot be addressed directly with a single technique or algorithm; better performance was achieved by combining different concepts. Results: We compared different approaches using exemplary data and found that ClustalW2 is a fairly good tool for analysis. We assigned different weight values to relevant features and obtained a score of 95 for comparison and 45 for alignment. Conclusion: Different techniques and approaches were evaluated and compared.
DOI: https://doi.org/10.3844/jcssp.2009.651.660
Copyright: © 2009 Hassan Mathkour and Muneer Ahmad. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Genome
- multi-lingual
- approximate matching
- nucleotide base pair
- corpora
- duplicate sequences