MedDiffuseMix: Preserving Diagnostic Evidence with Saliency-Aware Diffusion Medical Image Data Augmentation
Abstract
Limited data availability, class imbalance, and domain variability remain major barriers to reliable medical image classification. Conventional augmentation can improve training diversity but may distort diagnostically informative structures, whereas unconstrained generative augmentation may introduce label-inconsistent content. This paper proposes MedDiffuseMix, a saliency-guided diffusion mixing framework for controlled medical image augmentation. The method uses classifier-derived saliency maps to separate high-saliency diagnostic regions from low-saliency background areas and applies diffusion-guided mixing mainly to regions with lower diagnostic importance. Adaptive mixing, Gaussian boundary blending, and a saliency-preservation constraint reduce semantic distortion and reject or attenuate samples that shift model attention away from clinically relevant evidence. The framework is evaluated on four public benchmarks: the Radiological Society of North America pneumonia chest radiography dataset, Musculoskeletal Radiographs, PatchCamelyon, and the Breast Cancer Histopathological Image Classification dataset. Experiments with convolutional and transformer-based classifiers show that MedDiffuseMix improves accuracy, F1-score, and area under the receiver operating characteristic curve compared with standard augmentation, Mixup, GenMix, SaliencyMix, and diffusion-based augmentation baselines. Ablation studies confirm the importance of saliency guidance, adaptive region mixing, and smooth boundary blending. Visual attribution analysis further indicates that MedDiffuseMix better preserves diagnostically salient regions. These results suggest that saliency-guided diffusion mixing is an effective augmentation strategy for limited-data medical image classification.
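The pipeline described in the abstract (saliency-based region separation, mixing concentrated in low-saliency areas, Gaussian boundary blending, and a saliency-preservation check) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so the function names, the threshold, the `max_mix` cap, the `(1 - saliency)` weighting, and the cosine-similarity acceptance test are all assumptions chosen for illustration. The `generated` array stands in for the output of a diffusion model, which is not modelled here.

```python
import numpy as np


def gaussian_blur(img, sigma):
    """Separable Gaussian blur of a 2D array, using edge padding."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    # Convolve along rows, then along columns; "valid" undoes the padding.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def saliency_guided_mix(image, generated, saliency,
                        threshold=0.5, sigma=2.0, max_mix=0.8):
    """Blend `generated` into `image` mainly where saliency is low.

    All parameters are hypothetical; the paper's actual mixing rule
    is not specified in the abstract.
    """
    # Normalise the saliency map to [0, 1].
    s = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
    # Hard mask: 1 in low-saliency background, 0 in diagnostic regions.
    background = (s < threshold).astype(float)
    # Gaussian boundary blending: soften the mask edge to avoid seams.
    soft_mask = gaussian_blur(background, sigma)
    # Adaptive mixing (assumed form): weight shrinks as saliency grows.
    weight = max_mix * soft_mask * (1.0 - s)
    mixed = (1.0 - weight) * image + weight * generated
    return mixed, weight


def saliency_preserved(saliency_before, saliency_after, min_similarity=0.8):
    """Assumed acceptance test: cosine similarity of the two saliency maps."""
    a, b = saliency_before.ravel(), saliency_after.ravel()
    similarity = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return bool(similarity >= min_similarity)
```

In use, `saliency` would come from the classifier (for example a class activation map), and `saliency_preserved` would compare the classifier's saliency on the original image against its saliency on the mixed image, rejecting or attenuating the sample when the check fails. With the assumed weighting, pixels whose normalised saliency is 1 receive a mixing weight of exactly 0, so the most salient evidence is left untouched.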