Research Article Open Access

Adaptive Synthetic Oversampling Algorithm for Handling Class Imbalance in Multi-Class Data Stream Classification

Priya S.1 and Annie Uthra2
  • 1 Department of Computer Science and Engineering, SRM Institute of Science and Technology, India
  • 2 Department of Computational Intelligence, SRM Institute of Science and Technology, India

Abstract

Concept drift and class imbalance are major challenges in modern streaming data classification. In particular, when combined with complicating factors such as noise and overlapping class distributions, concept drift and data imbalance can considerably affect classifier results. In addition, various challenges affect the performance of existing oversampling schemes such as SMOTE and its derivatives. Moreover, several existing models concentrate on data imbalance in binary classification problems, whereas the more complex multi-class counterparts are yet to be explored. With this motivation, this study develops an Adaptive Synthetic Oversampling Algorithm (ASYNO) based Multiclass Streaming Data Classification (ASYNO-MCSDC) model for class imbalance handling and concept drift. The presented ASYNO-MCSDC method first performs several preprocessing stages: label encoding, data normalization, and data splitting. The Adaptive Synthetic oversampling technique (ASYNO) is then applied to handle the class imbalance problem. An online bagging ensemble classifier is employed for data classification, with the Hoeffding Tree (HT) as the base classifier and the number of estimators in online bagging set to 10. Two types of learning are used in the experiments: batch learning and incremental learning. The ASYNO-MCSDC model is validated experimentally on two datasets, namely a stationary imbalance stream and a dynamic imbalance stream. The experimental results indicate that the ASYNO-MCSDC model achieves promising results compared with other models.
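
The online bagging step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly used Oza-Russell formulation, in which each incoming example is shown to every base learner k times with k drawn from Poisson(1); the abstract does not confirm that this exact variant is used. The base learner here, `RunningCentroidLearner`, is a hypothetical incremental nearest-centroid model standing in for the Hoeffding Tree, and `make_toy_stream` and `prequential_accuracy` are likewise illustrative helpers, not part of the paper. The ASYNO oversampling step is not reproduced.

```python
import math
import random
from collections import Counter


class RunningCentroidLearner:
    """Hypothetical stand-in for a Hoeffding Tree: incremental nearest centroid."""

    def __init__(self):
        self.sums = {}            # label -> per-feature running sums
        self.counts = Counter()   # label -> number of examples seen

    def learn_one(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        for i, value in enumerate(x):
            self.sums[y][i] += value
        self.counts[y] += 1

    def predict_one(self, x):
        if not self.counts:
            return None  # nothing learned yet

        def sq_dist(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.counts, key=sq_dist)


def poisson1(rng):
    """Draw k ~ Poisson(1) using Knuth's multiplication method."""
    limit = math.exp(-1.0)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


class OnlineBagging:
    """Online bagging sketch; 10 estimators matches the abstract's setting."""

    def __init__(self, make_learner, n_estimators=10, seed=0):
        self.rng = random.Random(seed)
        self.learners = [make_learner() for _ in range(n_estimators)]

    def learn_one(self, x, y):
        # Each learner sees the example k ~ Poisson(1) times (possibly zero).
        for learner in self.learners:
            for _ in range(poisson1(self.rng)):
                learner.learn_one(x, y)

    def predict_one(self, x):
        # Majority vote over learners that can already make a prediction.
        preds = (learner.predict_one(x) for learner in self.learners)
        votes = Counter(p for p in preds if p is not None)
        return votes.most_common(1)[0][0] if votes else None


def make_toy_stream(n, seed=0):
    """Hypothetical imbalanced three-class stream (about 80% / 15% / 5%)."""
    rng = random.Random(seed)
    centers = {"a": (0.0, 0.0), "b": (10.0, 10.0), "c": (-10.0, 10.0)}
    labels, weights = ["a", "b", "c"], [0.80, 0.15, 0.05]
    for _ in range(n):
        y = rng.choices(labels, weights=weights)[0]
        cx, cy = centers[y]
        yield [cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)], y


def prequential_accuracy(model, stream):
    """Incremental (test-then-train) evaluation: predict first, then learn."""
    correct = total = 0
    for x, y in stream:
        correct += int(model.predict_one(x) == y)
        model.learn_one(x, y)
        total += 1
    return correct / total if total else 0.0
```

For example, `prequential_accuracy(OnlineBagging(RunningCentroidLearner), make_toy_stream(300))` runs the incremental setting on the toy stream. A batch setting would instead call `learn_one` over a training split before predicting on a held-out split.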

Journal of Computer Science
Volume 18 No. 7, 2022, 650-664

DOI: https://doi.org/10.3844/jcssp.2022.650.664

Submitted On: 24 May 2022 Published On: 26 July 2022

How to Cite: S., P. & Uthra, A. (2022). Adaptive Synthetic Oversampling Algorithm for Handling Class Imbalance in Multi-Class Data Stream Classification. Journal of Computer Science, 18(7), 650-664. https://doi.org/10.3844/jcssp.2022.650.664

Keywords

  • Machine Learning
  • Class Imbalance
  • Concept Drift
  • Data Classification
  • Oversample
  • Streaming Data