Research Article Open Access

Sentiment Analysis of Arabic Reviews for Saudi Hotels Using Unsupervised Machine Learning

Samar Alosaimi1, Maram Alharthi1, Khloud Alghamdi1, Tahani Alsubait1 and Tahani Alqurashi1
  • 1 Umm Al-Qura University, Saudi Arabia

Abstract

Virtual worlds such as social networking sites, blogs and content communities are rapidly becoming one of the most powerful sources for news, markets, industries and more. They can serve many purposes because they are rich platforms full of feedback, emotions, thoughts and reviews. The main objective of this paper is to cluster Arabic reviews of Saudi hotels into positive and negative clusters for sentiment analysis. We used web scraping to collect Arabic reviews associated only with Saudi hotels from the tourism website TripAdvisor, obtaining 4604 Arabic reviews in total. TF-IDF was then applied to extract relevant features. We applied an unsupervised learning approach, in particular the K-means and hierarchical clustering algorithms, with two distance metrics: cosine and Euclidean. Results on our manually labelled test data show that the K-means algorithm with cosine distance performed well when all of our preprocessing steps were applied. We conclude that the suggested preprocessing steps play a critical role in Arabic language processing and sentiment analysis.
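
As a rough illustration of the pipeline summarised in the abstract (TF-IDF features followed by K-means with cosine distance), a minimal sketch follows. It is not the authors' implementation. The function names (tfidf_vectors, cosine_similarity, kmeans_cosine), the smoothed IDF formula, the fixed seed documents and the toy English tokens standing in for preprocessed Arabic review tokens are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return unit-length TF-IDF vectors (sparse dicts) for whitespace-tokenised docs."""
    tokenised = [doc.split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Assumed weighting: relative term frequency times a smoothed IDF.
        vec = {t: (c / len(tokens)) * (math.log((1 + n_docs) / (1 + df[t])) + 1)
               for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine_similarity(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans_cosine(vectors, seeds, n_iter=20):
    """K-means using cosine distance; `seeds` are indices of the initial centroids."""
    centroids = [dict(vectors[i]) for i in seeds]
    labels = [-1] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: highest cosine similarity equals lowest cosine distance.
        new_labels = [
            max(range(len(centroids)), key=lambda k: cosine_similarity(v, centroids[k]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the normalised sum of its members.
        for k in range(len(centroids)):
            summed = Counter()
            for v, label in zip(vectors, labels):
                if label == k:
                    for t, w in v.items():
                        summed[t] += w
            norm = math.sqrt(sum(w * w for w in summed.values()))
            if norm > 0:
                centroids[k] = {t: w / norm for t, w in summed.items()}
    return labels


# Hypothetical toy reviews; real input would be preprocessed Arabic review text.
reviews = [
    "hotel clean excellent",
    "location excellent clean",
    "room dirty bad",
    "service bad dirty",
]
labels = kmeans_cosine(tfidf_vectors(reviews), seeds=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

Because the vectors are normalised to unit length, ranking centroids by cosine similarity is equivalent to ranking them by cosine distance. The cluster indices carry no sentiment meaning by themselves; mapping a cluster to "positive" or "negative" would still require labelled data such as the manually labelled test set mentioned above.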

Journal of Computer Science
Volume 16 No. 9, 2020, 1258-1267

DOI: https://doi.org/10.3844/jcssp.2020.1258.1267

Submitted On: 27 July 2020 Published On: 25 September 2020

How to Cite: Alosaimi, S., Alharthi, M., Alghamdi, K., Alsubait, T. & Alqurashi, T. (2020). Sentiment Analysis of Arabic Reviews for Saudi Hotels Using Unsupervised Machine Learning. Journal of Computer Science, 16(9), 1258-1267. https://doi.org/10.3844/jcssp.2020.1258.1267

Keywords

  • Unsupervised
  • Clustering
  • Hotel Reviews
  • Arabic Sentiment Analysis