Research Article Open Access

Rule Based Shallow Parser for Arabic Language

Mona Ali Mohammed and Nazlia Omar

Abstract

Problem statement: Shallow syntactic parsing is a language processing approach that computes a basic analysis of sentence structure rather than attempting full syntactic analysis. It identifies the constituents of a sentence (noun groups, verb groups, prepositional groups) but specifies neither their internal structure nor their role in the main sentence. The only technique used so far for Arabic shallow parsing is a Support Vector Machine (SVM) based approach. The main problem facing shallow parser developers is boundary identification, which must be accurate for the system to perform well.

Approach: The specific objective of this research was to identify all Noun Phrase (NP), Verb Phrase (VP) and Prepositional Phrase (PP) boundaries in the Arabic language. Based on a critical analysis of Arabic sentence structure, the study discussed various idiosyncrasies of Arabic sentences in order to derive more accurate rules for detecting the start and end boundaries of each clause in an Arabic sentence. New rules were proposed for the shallow parser, covering up to two levels of the full parse tree. We describe the implementation and evaluation of a rule-based shallow parser that handles chunking of Arabic sentences.

Results: The system was tested manually on 70 Arabic sentences comprising 1776 words, with sentence lengths ranging from 4 to 50 words. The system achieved an F-score of 97%, which is significantly better than previously published state-of-the-art results for Arabic.

Conclusion: The main achievement is the development of an Arabic shallow parser based on a rule-based approach. Chunking, which constitutes the main contribution, is achieved in two successive stages that group sequences of adjacent words on the basis of linguistic properties.
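
To make the idea of rule-based boundary detection concrete, the following is a minimal sketch and not the authors' rule set. It assumes the input is already POS-tagged, uses a hypothetical simplified tag set (DET, NOUN, ADJ, PRON, VERB, PREP) and three illustrative rules, and reports each chunk as a (label, start, end) span with an exclusive end index. The `f_score` helper shows one common way to score chunkers, by exact match of labelled spans; the abstract does not state how its F-score was computed, so that choice is also an assumption.

```python
from typing import List, Tuple

Token = Tuple[str, str]        # (word, POS tag)
Chunk = Tuple[str, int, int]   # (label, start, end), end is exclusive

# Hypothetical, simplified tag sets; not the tag set used in the paper.
NP_TAGS = {"DET", "NOUN", "ADJ", "PRON"}
VP_TAGS = {"VERB"}
PREP_TAG = "PREP"


def _run_end(tags: List[str], i: int, allowed: set) -> int:
    """Return the index just past the run of `allowed` tags starting at i."""
    while i < len(tags) and tags[i] in allowed:
        i += 1
    return i


def chunk(tokens: List[Token]) -> List[Chunk]:
    """Group adjacent tokens into NP, VP and PP chunks using simple rules."""
    tags = [tag for _, tag in tokens]
    chunks: List[Chunk] = []
    i = 0
    while i < len(tags):
        if tags[i] == PREP_TAG:
            # Illustrative rule: PP = preposition + the noun group after it.
            end = _run_end(tags, i + 1, NP_TAGS)
            chunks.append(("PP", i, end))
        elif tags[i] in VP_TAGS:
            end = _run_end(tags, i, VP_TAGS)
            chunks.append(("VP", i, end))
        elif tags[i] in NP_TAGS:
            end = _run_end(tags, i, NP_TAGS)
            chunks.append(("NP", i, end))
        else:
            end = i + 1  # token stays outside any chunk
        i = end
    return chunks


def f_score(predicted: List[Chunk], gold: List[Chunk]) -> float:
    """F-score over exactly matching (label, start, end) spans."""
    correct = len(set(predicted) & set(gold))
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Transliterated example: "the boy went to the school".
sentence = [("dhahaba", "VERB"), ("al-waladu", "NOUN"),
            ("ila", "PREP"), ("al-madrasati", "NOUN")]
predicted = chunk(sentence)
gold = [("VP", 0, 1), ("NP", 1, 2), ("PP", 2, 4)]
print(predicted, f_score(predicted, gold))
```

These toy rules merge any run of adjacent nominal tags into a single NP, so a verb followed directly by a subject noun and an object noun would be chunked as one NP. Handling such cases is the kind of boundary problem that language-specific rules are meant to address.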

Journal of Computer Science
Volume 7 No. 10, 2011, 1505-1514

DOI: https://doi.org/10.3844/jcssp.2011.1505.1514

Submitted On: 17 May 2011 Published On: 4 August 2011

How to Cite: Mohammed, M. A. & Omar, N. (2011). Rule Based Shallow Parser for Arabic Language. Journal of Computer Science, 7(10), 1505-1514. https://doi.org/10.3844/jcssp.2011.1505.1514

Keywords

  • Arabic shallow parsing
  • rule based approaches
  • text chunking
  • Arabic language processing
  • Arabic language phrases
  • Natural Language Processing (NLP)
  • hand shallow
  • Part Of Speech (POS)