
Medicine and Health
Advancing diagnostic performance and clinical usability of neural networks via adversarial training and dual batch normalization
T. Han, S. Nebelung, et al.
This research by Tianyu Han and colleagues shows how adversarially trained neural networks can improve both pathology detection and clinical usability. By employing dual batch normalization, the study demonstrates more interpretable saliency maps, validated across diverse datasets, and underscores the need for tailored training techniques in real-world medical imaging.
Introduction
The study addresses the vulnerability of conventionally trained deep learning models in medical imaging to adversarial perturbations and the limited interpretability of model decisions, which hinders clinical adoption. The authors hypothesize that adversarial training can both improve robustness to adversarial attacks and yield saliency maps that better align with clinically meaningful regions, enhancing clinical usability. They further posit that using dual batch normalization (separate normalization for real and adversarial samples) can mitigate the commonly reported accuracy drop associated with adversarial training, particularly when trained on sufficiently large datasets. The work is motivated by practical clinical scenarios where adversarial manipulation could bias automated decisions and by the need for reliable, interpretable model explanations in radiology.
Literature Review
Prior work demonstrated the success of CNNs in medical imaging tasks (e.g., lung cancer, retinal disease, skin lesions) and highlighted vulnerabilities to adversarial attacks, including imperceptible perturbations (one-pixel and PGD/FGSM-type attacks). Explainability techniques such as CAM and Grad-CAM, and gradient-based saliency methods have been used irrespective of training regime. Previous research found a trade-off between robustness and accuracy in adversarial training, often showing reduced accuracy for robust models. Studies also suggested that adversarial training can produce gradients that are sparser and more aligned with human perception and that robustness may require more data. Batch normalization has been implicated in optimization and representation smoothness; large-scale adversarially augmented training with separate normalization has improved accuracy in natural image tasks. This work builds upon these findings by applying adversarial training with dual batch norms to medical imaging and rigorously evaluating both diagnostic performance and clinician-rated interpretability.
Methodology
Datasets: Four medical imaging datasets were used. (1) CheXpert: 224,316 chest radiographs from 65,240 patients; 191,027 frontal radiographs used for training after label cleaning (uncertainty labels set to 1.0 except consolidation set to 0.0; labels not mentioned set to 0.0). Official internal validation/test set: 202 radiographs with consensus ground truth of three radiologists, with counts per class provided (e.g., 66 cardiomegaly, 42 edema, etc.). (2) NIH ChestX-ray8: 112,120 frontal radiographs from 30,805 patients; used as an external test set of 22,433 radiographs with label distributions given (e.g., 582 cardiomegaly, 413 edema, etc.). (3) Rijeka knee MRI dataset for ACL injury classification. (4) LUNA16 CT dataset with malignant tumor ROIs; tumor patches scaled to 64×64 pixels.
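The CheXpert label-cleaning rule above maps uncertainty labels to fixed values before training. A minimal sketch, assuming the labels arrive as a pandas DataFrame following the CheXpert CSV convention (1.0 = positive, 0.0 = negative, −1.0 = uncertain, blank/NaN = not mentioned); the column names are illustrative:

```python
import pandas as pd

# Columns assumed to follow the CheXpert naming convention.
PATHOLOGIES = ["Cardiomegaly", "Edema", "Consolidation", "Atelectasis", "Pleural Effusion"]

def clean_chexpert_labels(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    for col in PATHOLOGIES:
        df[col] = df[col].fillna(0.0)                # labels not mentioned -> 0.0
        uncertain = df[col] == -1.0                  # uncertainty labels
        df.loc[uncertain, col] = 0.0 if col == "Consolidation" else 1.0
    return df
```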
Models and training: ResNet-50 was used for all classification tasks. Optimization employed Adam (β1=0.9, β2=0.99, ε=1e−8), trained for up to 300 epochs with an initial learning rate of 0.01 decayed by 10× after 100 epochs. Early stopping was based on a development set, with sigmoid binary cross-entropy as the loss function. Images from CheXpert, ChestX-ray8, and kneeMRI were resized to 256×256; LUNA16 ROI patches to 64×64. Data augmentation included random color transformations (contrast, brightness, saturation, hue), spatial affine transforms, and random cropping; inputs were normalized to [0,1]. Computations used a multi-GPU setup (Nvidia Titan RTX), implemented in Python (NumPy, SciPy, PyTorch 1.1.0).
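As a minimal sketch, the optimizer and schedule described above could be configured in PyTorch as follows (the number of output classes is an assumption; the augmentation pipeline is omitted):

```python
import torch
import torchvision

# ResNet-50 with a multi-label head (number of pathologies is an assumption).
model = torchvision.models.resnet50(num_classes=5)

# Adam with the stated hyperparameters: beta1=0.9, beta2=0.99, eps=1e-8.
optimizer = torch.optim.Adam(model.parameters(), lr=0.01,
                             betas=(0.9, 0.99), eps=1e-8)

# Initial learning rate 0.01, decayed by 10x after 100 epochs (up to 300 epochs).
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100], gamma=0.1)

# Sigmoid binary cross-entropy for multi-label classification.
criterion = torch.nn.BCEWithLogitsLoss()
```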
Adversarial training and dual batch normalization: Robust models were trained via projected gradient descent (PGD)-based adversarial training by minimizing the expected adversarial loss: min_θ E_{(x,y)} max_{δ∈Δ} L(x+δ, y; θ), with Δ an l∞-bounded perturbation set. The study analyzed the loss-landscape smoothing and over-regularization effects associated with adversarial training. To mitigate performance loss, adversarial examples were treated as data augmentation while introducing separate batch normalization layers for real (BN_std) and adversarial (BN_adv) inputs (dual batch norm). The training objective combined losses on real and adversarial samples while updating shared convolutional parameters and separate BN statistics/parameters. Algorithm 1 details adversarial training with the separate BN_std and BN_adv paths using default ε=0.005, step size α=0.0025, k=10 PGD steps, and a parameter β=64 as specified by the authors. During training, adversarial examples were generated per batch with PGD, clipped within the ε-bounds, and both real and adversarial batches were passed through their respective BN paths; the total loss was L_std + L_adv.
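A hedged sketch of the PGD attack and the dual-BN training step described above; the set_bn_mode switch on the model is an assumed interface for routing inputs through BN_std or BN_adv, not the authors' actual API:

```python
import torch

def pgd_attack(model, x, y, criterion, eps=0.005, alpha=0.0025, k=10):
    """l_inf-bounded PGD: k steps of size alpha, projected into the eps-ball."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(k):
        loss = criterion(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()        # ascend along the gradient sign
            delta.clamp_(-eps, eps)                   # project onto the l_inf eps-ball
            delta.copy_((x + delta).clamp(0, 1) - x)  # keep perturbed pixels in [0, 1]
        delta.grad.zero_()
    return (x + delta).detach()

def dual_bn_train_step(model, x, y, criterion, optimizer):
    model.set_bn_mode("adv")               # assumed switch: use BN_adv statistics
    x_adv = pgd_attack(model, x, y, criterion)
    optimizer.zero_grad()                  # discard gradients accumulated by the attack
    model.set_bn_mode("std")               # BN_std path for real inputs
    loss_std = criterion(model(x), y)
    model.set_bn_mode("adv")               # BN_adv path for adversarial inputs
    loss_adv = criterion(model(x_adv), y)
    (loss_std + loss_adv).backward()       # total loss L_std + L_adv
    optimizer.step()                       # shared conv weights, separate BN parameters
```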
Model interpretation and representation analysis: Saliency maps were generated by back-propagating loss gradients to input pixels, clipping values to ±3× standard deviation around the mean and normalizing to [−1,1]. Representation similarity across layers was quantified using linear centered kernel alignment (linear CKA) computed across all 202 CheXpert internal test radiographs, comparing models trained standardly, adversarially with a single BN, and adversarially with dual BN evaluated through the respective BN paths.
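A minimal sketch of this saliency procedure; the exact rescaling to [−1, 1] after clipping is an assumption:

```python
import torch

def saliency_map(model, x, y, criterion):
    """Back-propagate the loss to the input pixels and post-process the gradient."""
    model.eval()
    x = x.clone().requires_grad_(True)
    criterion(model(x), y).backward()
    g = x.grad.detach()
    mu, sigma = g.mean().item(), g.std().item()
    g = g.clamp(mu - 3 * sigma, mu + 3 * sigma)       # clip to +/-3 std around the mean
    g = 2 * (g - g.min()) / (g.max() - g.min()) - 1   # rescale to [-1, 1] (assumed min-max)
    return g
```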
Reader study of interpretability: Six radiologists independently rated 100 randomly selected images per dataset (X-ray, MRI, CT; total 300 images) for how well saliency maps guided them to the correct pathology using a 0–5 scale (0=no correlation; 5=clear, unambiguous correlation). The three model types evaluated were: SSM (standard model saliency), SSBN (adversarially trained, single BN), and SDBN (adversarially trained, dual BN). Rating standards are defined in Table 2. An additional control experiment trained models using random Gaussian pixel-noise augmentation (σ=0.01 and σ=0.1) to compare saliency interpretability with adversarial augmentation.
Statistical analysis: Test-set metrics included ROC-AUC, sensitivity, and specificity. Threshold selection minimized (1−sensitivity)^2 + (1−specificity)^2. Bootstrapping with 10,000 redraws estimated confidence intervals. Significance testing for metric differences used permutation testing (N=1,000 resamples); p<0.001 was considered significant. The reader study used Friedman tests for overall differences across the three models and Wilcoxon signed-rank tests for pairwise comparisons where applicable.
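The threshold rule and bootstrap CI above can be sketched as follows; using scikit-learn for the ROC computation is an assumption (the paper lists NumPy/SciPy):

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

def select_threshold(y_true, y_score):
    """Threshold minimizing (1 - sensitivity)^2 + (1 - specificity)^2."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    cost = (1 - tpr) ** 2 + fpr ** 2   # sensitivity = TPR, specificity = 1 - FPR
    return thresholds[np.argmin(cost)]

def bootstrap_auc_ci(y_true, y_score, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for ROC-AUC with 10,000 redraws."""
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if y_true[idx].min() == y_true[idx].max():   # skip single-class resamples
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```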
External validation and data subsampling: Models trained on CheXpert were evaluated on the external ChestX-ray8 set (22,433 images). To assess data requirements, CheXpert training data were randomly subsampled to 1% (1,910 X-rays) and 10% (19,103 X-rays), analyzing AUC trends for standard, adversarial (single BN), and adversarial (dual BN) models.
Key Findings
- Robustness: Standard models trained on CheXpert were highly vulnerable to adversarial perturbations; ROC-AUC dropped sharply even for small ε (e.g., ε≤0.01). Adversarially trained models maintained substantially higher performance under PGD attacks (ε during training set to 0.005), demonstrating robustness.
- Accuracy vs robustness: With limited data (knee MRI and LUNA16), adversarial training with a single BN reduced accuracy compared to standard training. With a large dataset (CheXpert; 191,027 frontal radiographs), the performance gap narrowed; adversarially trained models with dual batch norms achieved ROC-AUC comparable to standard models across multiple pathologies. Summaries and CIs are reported in Supplementary Tables 3–5; no significant differences in ROC-AUC, sensitivity, or specificity between standard and dual-BN adversarial models on CheXpert.
- Dual batch normalization effect: Analytical expansion of the adversarial objective suggests adversarial training smooths the loss landscape (reducing Jacobian/Hessian magnitudes), potentially over-smoothing and limiting sensitivity to non-robust but useful features. Employing dual batch norms separates statistics for real and adversarial samples, preserving accuracy on real data while maintaining robustness. Linear CKA analyses showed that single-BN adversarial training increased long-range layer similarities (block-like patterns), indicating reduced network complexity; dual BN preserved complexity for real inputs, matching standard training patterns, while still adapting to adversarial inputs (see the linear CKA sketch after this list).
- Data requirements: Subsampling CheXpert to 1% and 10% showed that adding more data increased ROC-AUC for adversarially trained dual-BN models. For cardiomegaly, edema, and pneumothorax, single-BN adversarial models plateaued, whereas dual-BN models continued improving with more data. Pneumonia classification remained challenging due to low prevalence (2.4% positives in CheXpert).
- External validation: On ChestX-ray8 external test set (22,433 radiographs), dual-BN adversarial models outperformed single-BN adversarial models and performed comparably to standard models overall, though slightly worse than standard models on some pathologies (cardiomegaly, edema, atelectasis), likely due to distribution shift and higher sample complexity of robust learning.
- Interpretability: Six-radiologist ratings (0–5 scale) showed saliency maps from dual-BN adversarial models (SDBN) had significantly higher clinical utility than those from standard models (SSM) or single-BN adversarial models (SSBN): X-ray SSM 0.57±0.94, SSBN 2.20±1.33, SDBN 2.69±1.56 (Friedman p<0.001; exact p=1.0×10^-160); MRI SSM 0.49±0.74, SSBN 0.74±1.09, SDBN 2.17±1.57 (p=1.7×10^-118); CT SSM 0.47±0.73, SSBN 2.32±1.72, SDBN 2.50±1.74 (p=7.0×10^-150). Visual examples showed SDBN maps focusing on clinically relevant regions (e.g., organ shape, lung opacity, lesion borders). Control experiments with random noise augmentation yielded lower interpretability scores (σ=0.01: 1.14±1.23; σ=0.1: 1.35±1.37) than adversarial augmentation (2.14±1.43), indicating interpretability gains arise from adversarial training rather than generic noise augmentation.
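As referenced in the dual-batch-normalization finding above, a minimal sketch of linear centered kernel alignment; X and Y are (examples × features) activation matrices from two layers, flattened per example:

```python
import numpy as np

def linear_cka(X: np.ndarray, Y: np.ndarray) -> float:
    """Linear CKA between two activation matrices of shape (n_examples, n_features)."""
    X = X - X.mean(axis=0)                            # center each feature
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, ord="fro") ** 2    # cross-covariance energy
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return hsic / (norm_x * norm_y)
```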
Discussion
The findings support the hypothesis that adversarial training can produce models that are both robust to adversarial attacks and more clinically interpretable, with saliency maps aligning better to regions of diagnostic interest. The commonly observed accuracy–robustness trade-off can be mitigated using dual batch normalization and sufficient training data, enabling adversarially trained models to match standard models in ROC-AUC, sensitivity, and specificity on large-scale medical imaging tasks. Representation analyses suggest that single-BN adversarial training over-smooths the loss landscape and reduces network representational diversity, contributing to performance loss; dual BN decouples real and adversarial paths, preserving complexity and accuracy for real inputs while maintaining robustness for adversarial inputs. External validation confirms generalization benefits of dual BN, though slight performance gaps for some conditions highlight the influence of domain shift and the greater data requirements inherent to robust learning. Clinically, improved saliency map usefulness may enhance trust and adoption of AI in radiology, though the direct impact on diagnostic endpoints requires further investigation.
Conclusion
Adversarially trained neural networks with dual batch normalization achieve diagnostic performance comparable to standard models while providing superior robustness to adversarial attacks and substantially more clinically useful saliency maps. The dual BN approach preserves model complexity for real inputs and mitigates the accuracy degradation typically associated with adversarial training. External validation on 22,433 X-rays supports transferability. Future work should assess higher-dimensional (3D/4D) medical imaging models, where adversarial training is more challenging, and rigorously evaluate whether improved interpretability translates to better clinical outcomes. Scaling datasets further may enable adversarially trained dual-BN models to surpass standard models in accuracy.
Limitations
- Current models process 2D inputs; many medical imaging modalities are inherently 3D/4D. Adversarial training may become more difficult in higher-dimensional spaces, and results need replication in volumetric/time-resolved models.
- Slightly worse external generalization for some pathologies (e.g., cardiomegaly, edema, atelectasis) compared to standard models indicates sensitivity to distribution shift between datasets.
- Robust models typically require more training data (higher sample complexity) to generalize well; limited data settings (e.g., knee MRI, LUNA16) showed accuracy drops with single-BN adversarial training.
- Class imbalance (e.g., pneumonia at 2.4% in CheXpert) limited adversarial training performance and stability for certain pathologies.
- Reader studies assessed saliency interpretability but did not directly measure impact on clinical diagnostic accuracy or workflow outcomes.