Optical Character Recognition System for Arabic Text Using Cursive Multi-Directional Approach
Abstract
This paper presents a novel technique for the recognition of Arabic text, based on feature extraction and dynamic cursor sizing. The most challenging area in Arabic OCR (AOCR) research is the segmentation of words into sub-words and individual characters. The basic concept followed here is a logical, dynamically sized cursor that "travels" through a text image one word at a time while extracting stroke features. Several rules govern the size and movement of the cursor through each segment. The features obtained from each segment are termed strokes; each segment is described by a number of strokes, and each stroke is defined mainly as a sequence of directions. The strokes obtained are then "pieced" back together and classified into character classes using a knowledge base, which yields the eventual recognition of characters. The results demonstrate that the technique is successful.
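To illustrate what "a stroke defined as a sequence of directions" could look like, the following is a minimal sketch and not the authors' implementation. It assumes a stroke has already been traced as a list of adjacent (row, col) pixels and encodes it with eight compass directions, collapsing repeated steps. The function name `stroke_directions`, the `DIRECTIONS` table, and the eight-direction coding are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical illustration: the names and the 8-direction coding below are
# assumptions, not taken from the paper.

# Image coordinates: row grows downward, column grows to the right.
DIRECTIONS = {
    (0, 1): "E", (-1, 1): "NE", (-1, 0): "N", (-1, -1): "NW",
    (0, -1): "W", (1, -1): "SW", (1, 0): "S", (1, 1): "SE",
}


def stroke_directions(path):
    """Encode a traced pixel path as a run-collapsed sequence of directions.

    `path` is a list of (row, col) tuples in which consecutive pixels are
    8-connected neighbours.
    """
    directions = []
    for (r0, c0), (r1, c1) in zip(path, path[1:]):
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTIONS:
            raise ValueError(f"pixels {(r0, c0)} and {(r1, c1)} are not adjacent")
        direction = DIRECTIONS[step]
        # Keep only changes of direction, so a long straight run is one symbol.
        if not directions or directions[-1] != direction:
            directions.append(direction)
    return directions


# A stroke traced right to left and then downward.
path = [(0, 3), (0, 2), (0, 1), (1, 0), (2, 0)]
print(stroke_directions(path))  # ['W', 'SW', 'S']
```

A classifier in the spirit of the abstract would then match such direction sequences against entries in a knowledge base; that matching step is not sketched here because the abstract gives no detail on it.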
DOI: https://doi.org/10.3844/jcssp.2007.549.555
Copyright: © 2007 Mansoor Al-A'ali and Jamil Ahmad. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Arabic
- OCR
- features
- strokes
- segments