A Comparative and Hybrid Study of CNN and Transformer Models for Multi-Class Virus Classification in Transmission Electron Microscopy
Abstract
The automatic recognition of virus particles in transmission electron microscopy (TEM) images remains a demanding task, primarily owing to strong inter-class similarity, scale variability, and pronounced class imbalance. In this study, several convolutional neural networks and transformer-based architectures were comparatively evaluated for the classification of 22 virus categories using the TEM virus dataset. All models were trained under identical preprocessing and optimization conditions, and imbalance effects were mitigated through a weighted cross-entropy formulation. Performance was quantified using overall accuracy together with macro-averaged precision, recall, and F1 score. Among standalone models, the Swin Transformer achieved the highest accuracy (0.8831) and macro-F1 score (0.8444), followed by DeiT (accuracy 0.8669). Convolutional architectures exhibited comparatively lower balanced performance, with ResNet50 demonstrating substantial degradation (accuracy 0.5887) under imbalanced conditions. To exploit complementary representational properties, decision-level hybrid strategies were implemented. The performance-weighted hybrid attained an accuracy of 0.8831 and the highest macro-F1 score (0.8528), slightly surpassing the equal-weight hybrid configuration. These observations indicate that architectural heterogeneity contributes to improved inter-class balance without sacrificing overall predictive accuracy. Future work may explore scale-aware representations, feature-level fusion mechanisms, and expanded TEM datasets to further enhance robustness and generalization in virus identification tasks.
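The three mechanisms named in the abstract (a weighted cross-entropy loss, macro-averaged F1, and decision-level fusion with equal or performance-derived weights) can be illustrated with a minimal pure-Python sketch. The abstract does not state how the class weights or the fusion weights were computed, so everything specific here is an assumption made for illustration: the inverse-frequency weighting scheme, normalising the loss by the sum of the sample weights, fusing by a weighted average of class probabilities, and all function names (`inverse_frequency_weights`, `weighted_cross_entropy`, `macro_f1`, `fuse_probabilities`). The numbers in the demo are toy values, not results from the paper.

```python
import math


def inverse_frequency_weights(class_counts):
    """Assumed scheme: weight_c = N / (K * n_c), so rare classes weigh more."""
    total = sum(class_counts)
    num_classes = len(class_counts)
    return [total / (num_classes * count) for count in class_counts]


def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted negative log-likelihood, normalised by the sum of the weights
    of the true classes (one common convention; assumed here)."""
    weighted_nll = sum(
        class_weights[y] * -math.log(p[y]) for p, y in zip(probs, labels)
    )
    weight_sum = sum(class_weights[y] for y in labels)
    return weighted_nll / weight_sum


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


def fuse_probabilities(per_model_probs, model_weights):
    """Decision-level fusion: weighted average of each model's class
    probabilities. Equal weights give the equal-weight hybrid; weights taken
    from a validation score give a performance-weighted hybrid (assumed)."""
    weight_sum = sum(model_weights)
    fused = []
    for sample_probs in zip(*per_model_probs):
        num_classes = len(sample_probs[0])
        fused.append([
            sum(w * p[c] for w, p in zip(model_weights, sample_probs)) / weight_sum
            for c in range(num_classes)
        ])
    return fused


def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=row.__getitem__)


# Toy demo with 3 classes and two hypothetical models (values are made up).
model_a = [[0.6, 0.3, 0.1]]
model_b = [[0.2, 0.7, 0.1]]
equal = fuse_probabilities([model_a, model_b], [1.0, 1.0])
weighted = fuse_probabilities([model_a, model_b], [3.0, 1.0])
print("equal-weight prediction:", argmax(equal[0]))
print("performance-weighted prediction:", argmax(weighted[0]))
```

In this toy case the two fusion rules disagree, which shows why the choice of model weights can shift per-class decisions, and therefore macro-F1, even when overall accuracy is unchanged.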