Hybrid Attention-Based Stacked Bi-LSTM Model for Automated Multi-Image Captioning
- 1 Department of Computer Science and Engineering, School of Computing, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Chennai, Tamil Nadu, India
Abstract
Medical image captioning has recently become a prominent research field. The distinct characteristics of medical imaging data pose a number of challenges for captioning, and the variability across image modalities makes it difficult to generate effective captions. The proposed study therefore aims to design a novel Multi-image Captioning Hybrid Attention Model that provides effective automated medical image captioning with minimal medical errors. Image acquisition is the initial stage, in which input images are obtained from the specified dataset. Data augmentation is then applied to increase the dataset's size. After that, preprocessing enhances the quality of the inputs through Improved Wiener Filtering (IWF), image resizing, and color channel conversion. Next, the necessary features are extracted and bounding boxes are generated using a new Position Attentional YOLOv5 (PA-YOLOv5) approach. Subsequently, captioning is performed by the proposed Attention-based Stacked Bi-directional Long Short-Term capsule network (A-SBiLSTCN) model. To enhance the efficiency of the proposed model, its hyperparameters are fine-tuned using the Chaotic Flamingo Search Optimization (CFSO) algorithm during the training stage. The experiments are implemented in Python, and the simulation is performed using the PEIR dataset. The proposed model outperformed existing methods in terms of BLEU score (92.87%), METEOR score (88.20%), ROUGE-L score (73.20%), SPICE score (70.76%), and RIBES score (60.40%).
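To make the headline BLEU figure easier to interpret, the following is a minimal sketch of how a sentence-level BLEU score can be computed: clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and scaled by a brevity penalty. It is illustrative only. The abstract does not state which BLEU variant, n-gram order, or smoothing the study used, so the function name `sentence_bleu`, the single-reference setup, the uniform weights, and the absence of smoothing are all assumptions here, and the example captions are invented.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Single-reference BLEU with uniform weights and no smoothing.

    Assumption: this is the textbook formulation, not necessarily the
    exact variant used in the study.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical captions, used only to exercise the function.
reference = "chest radiograph showing a small nodule in the left lung".split()
candidate = "chest radiograph showing a small nodule in the right lung".split()
score = sentence_bleu(candidate, reference)
print(f"BLEU = {100 * score:.2f}%")
```

The function returns a value in [0, 1]; multiplying by 100 gives a percentage in the same form as the scores reported above.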
DOI: https://doi.org/10.3844/jcssp.2025.883.904
Copyright: © 2025 Paspula Ravinder and Saravanan Srinivasan. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Medical Image Captioning
- Hybrid Attention
- Color Channel
- YOLOv5
- Optimization
- Hyperparameter Tuning
- BLEU Score