Research Article Open Access

Email Spam Classification Using Gated Recurrent Unit and Long Short-Term Memory

Iqbal Basyar¹, Adiwijaya¹ and Danang Triantoro Murdiansyah¹
  • ¹ Telkom University, Indonesia

Abstract

The high volume of spam email has led to an increase in email triage, causing losses amounting to USD 355 million per year. One way to reduce this loss is to classify spam email into categories, such as fraud or promotions sent by unwanted parties. Early spam email classification relied on simple methods such as word filters. More complex methods have since emerged, such as sentence modeling using machine learning. Among the best-known methods for text classification are networks with Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU). This study applies both LSTM and GRU to the classification of spam emails, with XGBoost as the base model. In the scenario without dropout, LSTM and GRU obtained the same accuracy of 0.990183 (about 99.02%), superior to XGBoost. In the scenario with dropout, LSTM outperformed GRU and XGBoost, with accuracies of 98.60%, 98.58% and 98.52%, respectively. For recall in the scenario with dropout, GRU scored better than LSTM and XGBoost, with values of 98.98%, 98.92% and 98.15%, respectively. For recall in the scenario without dropout, LSTM was reported as superior to GRU and XGBoost, with values of 98.39%, 98.39% and 98.15%, respectively.

Journal of Computer Science
Volume 16 No. 4, 2020, 559-567

DOI: https://doi.org/10.3844/jcssp.2020.559.567

Submitted On: 9 August 2019
Published On: 3 April 2020

How to Cite: Basyar, I., Adiwijaya, & Murdiansyah, D. T. (2020). Email Spam Classification Using Gated Recurrent Unit and Long Short-Term Memory. Journal of Computer Science, 16(4), 559-567. https://doi.org/10.3844/jcssp.2020.559.567

Keywords

  • GRU
  • LSTM
  • Spam Classification