Research Article Open Access

Big Data Framework for Predicting Infectious Diseases to Improve Healthcare by Discovering New Symptom Patterns

Amal Mohamed Mounir¹, Mohamed Ibrahim Marie¹ and Laila Abd-Elhamid¹
  • ¹ Department of Information Systems, Faculty of Computers and Artificial Intelligence, Helwan University, Egypt

Abstract

Big data offers a promising opportunity for infectious disease control, because these novel data streams can improve the timeliness of preventive measures. Healthcare providers in both the public and private sectors generate, store, and analyse extensive datasets to improve the quality of the services they deliver. The recent outbreak of the novel coronavirus disease, COVID-19, has posed significant threats to human health, life, production, social connections, and international relations. Consequently, big data technologies have played a pivotal role in the response to the pandemic. Infectious diseases arise when a person contracts a pathogen transmitted by another person, and they pose challenges at both the individual and macro scales. Furthermore, the unknown patterns of infectious illnesses complicate prediction. This study aims to establish a big data framework for predicting infectious diseases by uncovering new symptom patterns, ultimately improving healthcare infection prevention and control. To achieve this objective, machine-learning algorithms such as K-Nearest Neighbors (KNN) and Random Forest (RF) were employed for cleaning and maintaining extensive datasets collected from December 2019 to June 2020. Additionally, the FP-growth and Park, Chen, and Yu (PCY) algorithms were applied to identify new patterns. Among the classifiers, Support Vector Machines (SVM) achieved the highest accuracy (98.2%) and the highest F1 score (94.80%), while Random Forest had the highest precision (92.80%). Similarly, the PCY algorithm outperformed FP-growth, achieving an accuracy of 98.5%. These findings underscore the potential of big data and machine learning for pattern recognition and infectious disease prediction, ultimately contributing to improved public health outcomes.
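To make the pattern-mining step more concrete, the following is a minimal sketch of the two-pass Park, Chen, and Yu (PCY) idea for finding frequent symptom pairs. It is not the authors' implementation: the function name `pcy_frequent_pairs`, the bucket count, the CRC32 hash, and the toy symptom records are all assumptions made purely for illustration.

```python
import zlib
from collections import Counter
from itertools import combinations


def pcy_frequent_pairs(transactions, min_support, num_buckets=101):
    """Illustrative PCY sketch: return {pair: count} for item pairs that
    occur in at least `min_support` transactions."""

    def bucket_of(pair):
        # Deterministic hash of a sorted pair into one of `num_buckets` buckets.
        return zlib.crc32("|".join(pair).encode("utf-8")) % num_buckets

    # Pass 1: count single items and hash every pair into a bucket counter.
    item_counts = Counter()
    bucket_counts = [0] * num_buckets
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair)] += 1

    # Between passes: keep frequent items and a bitmap of frequent buckets.
    frequent_items = {i for i, c in item_counts.items() if c >= min_support}
    frequent_bucket = [c >= min_support for c in bucket_counts]

    # Pass 2: count only candidate pairs whose items are both frequent
    # and whose bucket was frequent in pass 1.
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(i for i in set(transaction) if i in frequent_items)
        for pair in combinations(items, 2):
            if frequent_bucket[bucket_of(pair)]:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_support}


# Hypothetical symptom records, invented for this example only.
records = [
    {"fever", "cough", "fatigue"},
    {"fever", "cough"},
    {"cough", "loss_of_smell"},
    {"fever", "cough", "loss_of_smell"},
    {"fever", "fatigue"},
]

print(pcy_frequent_pairs(records, min_support=3))
```

The bucket bitmap only prunes candidates in the second pass, so the returned pairs do not depend on the hash function or the number of buckets; those choices affect memory use and speed rather than the result.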

Journal of Computer Science
Volume 20 No. 10, 2024, 1251-1262

DOI: https://doi.org/10.3844/jcssp.2024.1251.1262

Submitted On: 20 February 2024 Published On: 2 August 2024

How to Cite: Mounir, A. M., Marie, M. I. & Abd-Elhamid, L. (2024). Big Data Framework for Predicting Infectious Diseases to Improve Healthcare by Discovering New Symptom Patterns. Journal of Computer Science, 20(10), 1251-1262. https://doi.org/10.3844/jcssp.2024.1251.1262

Keywords

  • Big Data
  • Healthcare
  • Association Rule Mining
  • Random Forest
  • Infectious Diseases
  • PCY Algorithm