Enhancing Object Detection Robustness: A Synthetic and Natural Perturbation Approach

Computer Science


N. Premakumara, B. Jalaian, et al.

Discover the cutting-edge research by Nilantha Premakumara, Brian Jalaian, Niranjan Suri, and Hooman Samani, which explores how synthetic perturbations can boost the robustness of object detection models against real-world challenges such as varying lighting and blur. This study sheds light on how these advancements can lead to more reliable detection systems.

Introduction
Object detection, a fundamental and critical problem in computer vision, aims to identify and spatially localize objects within images or videos. It underpins tasks such as object tracking, activity recognition, image captioning, segmentation, and visual question answering, and is challenging due to high intra-class and low inter-class variance. Recent advances in deep learning have driven progress across computer vision and related domains, and as these techniques move into safety-critical applications (autonomous vehicles, medical diagnostics, robotics), ensuring reliability and trustworthiness is paramount. Real-world deployments do not guarantee high-quality inputs, so robustness to distribution shifts must be evaluated before deployment. Prior robustness interventions include alternative architectures, data additions, losses, and optimizers, often targeting specific shifts such as noise or synthetic corruptions.

This work presents a novel approach to assessing robustness by comparing performance under synthetic and real natural perturbations. Four models (Detr-ResNet-101, Detr-ResNet-50, YOLOv4, YOLOv4-tiny) are evaluated on COCO 2017 with synthetic perturbations generated via AugLy to approximate natural perturbations, then tested on real perturbations using ExDark. A comprehensive ablation study retrains the models with synthetic perturbations and evaluates their robustness on ExDark.

Contributions: (1) systematically identify the optimal level of synthetic perturbation that enhances robustness, highlighting the benefits of synthetic data augmentation for real-world deployment; (2) provide an ablation study establishing a connection between synthetic augmentation and robustness to real-world distribution shifts. The findings offer insights for developing more reliable object detection models tailored to real-world applications.
Literature Review
The robustness of CNN-based object detectors to noise and adverse conditions is critical, particularly for surveillance with challenging image quality. Studies indicate that noisy images degrade classification, and that synthetic rain degrades detectors much as real rain does. Approaches to simulate realistic weather (snow, fog) in datasets have been proposed to study reconstruction and robustness. Classifiers trained on ImageNet experience notable accuracy declines under natural perturbations; such perturbations also cause detection localization errors, reducing mAP for detectors like Faster R-CNN and R-FCN. Thermal surveillance studies assess detection and recognition techniques under night and weather variations. Semantic adversarial editing has been proposed to generate believable corruptions that expose challenging data points and improve robustness to natural corruptions. Dataset quality and availability strongly influence DNN performance. Several challenging datasets capture adverse conditions (e.g., ExDark, UNIRI-TID, RESIDE, UFDD, See in the Dark), focusing on low light, weather, and occlusions. Data augmentation has been shown to enhance resilience using noise, deep artificial transformations, natural transformations, and combinations of simple image operations. AugLy, an open-source library with 100+ augmentations across modalities, mimics user edits seen on social platforms and supports robustness assessment and improvement. In this work, three image augmentations are used: blur, pixel degradation, and brightness. Overall, prior work underscores the need for models that are robust to natural data variance and corruptions, leveraging generative models, augmentation, and synthetic corruption benchmarks. Challenging datasets and augmentation libraries like AugLy are vital for assessing and enhancing robustness under adverse conditions.
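The paper applies the three augmentations (blur, brightness, pixel degradation) through AugLy. As a dependency-free illustration of what each transform does, the sketch below implements simplified versions on a grayscale image stored as a list of rows of 0-255 ints; the function names and exact arithmetic are assumptions for illustration, not AugLy's implementation.

```python
def brightness(img, factor):
    """Scale every pixel by `factor`, clamping to the 0-255 range.
    factor > 1 lightens the image, factor < 1 darkens it."""
    return [[min(255, max(0, int(p * factor))) for p in row] for row in img]

def box_blur(img, radius):
    """Mean filter over a (2*radius+1)^2 neighbourhood, clamping at edges."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[min(h - 1, max(0, y + dy))][min(w - 1, max(0, x + dx))]
                    for dy in range(-radius, radius + 1)
                    for dx in range(-radius, radius + 1)]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

def pixelate(img, block):
    """Degrade detail by filling each block x block tile with its top-left pixel."""
    h, w = len(img), len(img[0])
    return [[img[(y // block) * block][(x // block) * block] for x in range(w)]
            for y in range(h)]
```

In practice each perturbation's strength parameter (blur radius, brightness factor, degradation factor) is swept over several levels to simulate increasingly severe distribution shifts.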
This research builds on these findings to investigate the impact of synthetic natural perturbations and the benefits of retraining on synthetic perturbations for improved robustness to real perturbations. A summarized comparison of methods addressing detection in challenging conditions (low light, domain adaptation, GAN-based day-night transfer, siamese networks with binary segmentation, RGB+thermal fusion with attention) highlights advantages such as context fusion and domain joining, alongside limitations like reliance on prior information, sensitivity to motion blur, and environment-dependent efficacy of thermal imaging.
Methodology
We evaluate the robustness of four pretrained object detection models, using COCO 2017 for training and augmentation and ExDark for evaluation under real perturbations, introducing synthetic perturbations with AugLy.

Models: Detr-ResNet-101, Detr-ResNet-50, YOLOv4, YOLOv4-tiny.

Datasets: COCO 2017 for training and for creating synthetic perturbations; ExDark for evaluating robustness and performance under real-world perturbations in the ablation study.

Synthetic perturbations: Three types (blur, brightness with light and dark modes, and pixel degradation), implemented via AugLy with parameter ranges chosen to simulate natural perturbations. Multiple levels per category were applied to simulate distribution shifts; increasing the perturbation level decreased detection confidence across all models relative to the original images.

Ablation study design: We focus on brightness perturbations to (1) quantify the impact of varying levels of synthetic brightness on robustness and identify the optimal level, and (2) analyze how improvements transfer from synthetic to real perturbations. We retrain the models on COCO subsets in which a randomly chosen percentage of images (0%, 20%, 50%, or 70%) is augmented with synthetic poor brightness, then evaluate on ExDark, examining how the augmented share of the training set influences performance under real perturbations. We track mAP and loss across the original:synthetic ratios 100:0, 80:20, 50:50, and 30:70.
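The retraining setup can be sketched as a simple dataset-mixing step that perturbs a chosen percentage of training images. The function name, the `perturb` callable, and the toy data below are hypothetical stand-ins, not the authors' pipeline:

```python
import random

def mix_training_set(images, synthetic_pct, perturb, seed=0):
    """Return a copy of `images` in which `synthetic_pct` percent of entries,
    chosen at random, are replaced by their perturbed versions."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    n_synth = round(len(images) * synthetic_pct / 100)
    chosen = set(rng.sample(range(len(images)), n_synth))
    return [perturb(img) if i in chosen else img for i, img in enumerate(images)]

# Example: build an 80:20 original:synthetic mix from a toy 10-item dataset,
# using negation as a stand-in for a brightness perturbation.
dataset = list(range(1, 11))  # stand-ins for image tensors
mixed = mix_training_set(dataset, 20, perturb=lambda x: -x)
```

Sweeping `synthetic_pct` over 0, 20, 50, and 70 reproduces the four original:synthetic ratios studied in the ablation.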
Key Findings
- Synthetic perturbation sensitivity: Across synthetic perturbations (blur, brightness, pixel degradation), models were most susceptible to strong brightness increases; darkness was less harmful than the other perturbations.
- Optimal perturbation levels: Performance degraded when blur radius > 5, brightness (light) factor > 2, brightness (dark) factor < 0.2, or pixel degradation factor > 0.2. Detr-ResNet-101 showed the highest robustness among the four models across perturbation levels.
- Model comparisons: Detr-ResNet-101 outperformed Detr-ResNet-50, and YOLOv4 outperformed YOLOv4-tiny, consistently under both synthetic and real perturbations. The best overall robustness was observed with Detr-ResNet-101 trained with augmented synthetic perturbations.
- Transfer from synthetic to real: Retraining with synthetic brightness perturbations improved robustness on ExDark (real low-light perturbations). As the percentage of synthetic augmentation increased, ExDark mAP increased and the performance gap between models narrowed.

Ablation results (COCO-testing mAP | ExDark mAP) by original:synthetic ratio:
- Detr-ResNet-101: 100:0 → 73.56 | 51.85; 80:20 → 73.24 | 61.43; 50:50 → 75.23 | 70.56; 30:70 → 77.92 | 76.47
- Detr-ResNet-50: 100:0 → 62.37 | 48.28; 80:20 → 64.76 | 58.59; 50:50 → 63.47 | 62.58; 30:70 → 67.58 | 66.89
- YOLOv4: 100:0 → 71.49 | 49.48; 80:20 → 68.78 | 60.45; 50:50 → 70.37 | 61.57; 30:70 → 72.45 | 62.45
- YOLOv4-tiny: 100:0 → 64.52 | 32.45; 80:20 → 65.43 | 38.58; 50:50 → 67.84 | 42.59; 30:70 → 69.51 | 56.73

Overall, increasing synthetic augmentation improved ExDark mAP substantially, particularly for Detr-ResNet-101 and YOLOv4-tiny.
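The mAP figures above rest on matching predicted boxes to ground truth by intersection-over-union (IoU), with detections typically counted as true positives at IoU ≥ 0.5. As a reference point, here is a minimal IoU helper in plain Python; it is an illustrative sketch, not the evaluation code used in the study:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle; width/height clamp to 0 when the boxes are disjoint.
    inter = (max(0, min(ax2, bx2) - max(ax1, bx1))
             * max(0, min(ay2, by2) - max(ay1, by1)))
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union else 0.0
```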
Discussion
The study addresses the core question of whether synthetic perturbations can enhance robustness to real-world distribution shifts in object detection. Results show a clear, monotonic improvement in ExDark performance as the proportion of synthetic brightness-perturbed images in training increases, indicating effective transfer of robustness from synthetic to real perturbations. This validates synthetic augmentation (especially brightness perturbations) as a practical proxy for natural low-light conditions. The identification of perturbation parameter thresholds (e.g., blur radius ≤ 5; brightness light factor ≤ 2; brightness dark factor ≥ 0.2; pixel degradation ≤ 0.2) provides actionable guidance for constructing augmentation policies that improve robustness without overly degrading training data quality. Model-wise, Detr-ResNet-101 consistently exhibits superior robustness, while YOLOv4 outperforms YOLOv4-tiny, aligning with expectations about capacity and feature representation. These findings are relevant for deploying detectors in adverse conditions: practitioners can incorporate tuned synthetic perturbations into training to narrow the performance gap observed under distribution shifts, prioritize higher-capacity models when feasible, and select augmentation parameters that best reflect expected operational conditions.
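The reported parameter thresholds can be collected into a small guard for augmentation pipelines. The dictionary keys, range endpoints as closed intervals, and the function itself are illustrative assumptions based only on the ranges quoted above:

```python
# Safe bands implied by the study's findings: values outside these
# ranges noticeably degraded detection performance.
SAFE_RANGES = {
    "blur_radius": (0, 5),           # degradation observed for radius > 5
    "brightness_light": (1.0, 2.0),  # lightening factor > 2 was harmful
    "brightness_dark": (0.2, 1.0),   # darkening factor < 0.2 was harmful
    "pixel_degradation": (0.0, 0.2), # degradation factor > 0.2 was harmful
}

def within_safe_range(param, value):
    """True if a perturbation parameter stays inside the reported safe band."""
    lo, hi = SAFE_RANGES[param]
    return lo <= value <= hi
```

A check like this lets an augmentation policy sample perturbation strengths that stress the model without pushing training data past the levels the study found counterproductive.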
Conclusion
This study evaluated robustness of four object detectors (Detr-ResNet-101, Detr-ResNet-50, YOLOv4, YOLOv4-tiny) to natural perturbations by simulating synthetic perturbations with AugLy and testing transfer to real low-light conditions using ExDark. Experiments examined performance under synthetic and real perturbations, the effect of augmented training set size, and optimal perturbation levels. Key outcomes: augmenting training with synthetically perturbed images markedly improves robustness to real-world perturbations, particularly under challenging lighting. Detr-ResNet-101 demonstrated the strongest robustness overall, though model choice should consider computational constraints, speed-accuracy trade-offs, and domain-specific perturbations. Future work should broaden datasets and perturbation modalities, include more architectures, expand ablations beyond brightness to additional perturbations, and explore alternative augmentation strategies and parameter tuning to further improve robustness across diverse real-world scenarios.
Limitations
- Dataset scope: Evaluation of real perturbations was limited to ExDark (low light). Generalization to other natural perturbations (occlusions, weather, sensor noise) requires broader datasets.
- Model scope: Only four pretrained detectors were studied; including additional architectures and specialized robust models would give a more comprehensive view.
- Perturbation focus: Ablation studies focused mainly on brightness; expanding to other synthetic perturbation types is needed for a complete understanding.
- Augmentation design: Results may depend on augmentation types and parameter choices; further exploration and optimization of augmentation policies are warranted.