An Open-source Tool for Hyperspectral Image Augmentation in Tensorflow

Computer Science

M. Abdelhack

An open-source TensorFlow extension enables image augmentation for hyperspectral satellite images, unlocking information beyond RGB channels and helping researchers prototype and deploy deep learning models for remote sensing. This research was conducted by Mohamed Abdelhack.
Introduction

The paper addresses the gap between state-of-the-art deep learning tools, which are designed for natural RGB images, and the needs of hyperspectral satellite imagery in remote sensing. Satellite images differ from natural images in their distribution and carry many more spectral channels, so directly reusing pretrained models and standard augmentation pipelines is suboptimal. TensorFlow’s native image generator supports only up to four channels, which limits hyperspectral workflows. The study proposes and demonstrates an open-source tool that enables hyperspectral image augmentation within TensorFlow/Keras. Its aims are to facilitate development, testing, and deployment of deep learning models for remote sensing applications, and to broaden access for researchers using open satellite data (e.g., Copernicus).

Literature Review

Prior work shows rapid adoption of deep learning in remote sensing (e.g., Ma et al., 2019) and successful use of image augmentation both for natural images (e.g., Cireșan et al., 2011; Simard et al., 2003) and for hyperspectral/satellite tasks (e.g., Scott et al., 2017; Qiu et al., 2019). However, many studies that used augmentation for hyperspectral images implemented bespoke tooling, and no widely available open-source augmentation toolkit for hyperspectral data was integrated into TensorFlow. Speckle noise is recognized as relevant for satellite/SAR imagery and has inspired augmentation and model-robustness strategies (e.g., Kwak et al., 2018).

Methodology

Tool: An image generator for hyperspectral data was developed for TensorFlow/Keras using Python and scikit-image, extending earlier examples (Leitloff & Riese, 2018). It supports horizontal and vertical flipping, rotation, translation, zooming, shearing, and the addition of speckle (multiplicative) noise. Speckle noise augmentation was added to better simulate satellite-specific noise. When a transform moves image content so that part of the output frame would be left empty, edge pixels are repeated to fill the gap and preserve the image size. The tool is open-source (MIT license) and available on GitHub.
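As a rough illustration of how such a generator can work on more than four channels, the sketch below applies random flips and multiplicative speckle noise to 13-band arrays and yields batches from a plain Python generator. It is not the tool's code: it uses only NumPy rather than scikit-image, and the names `random_flip`, `add_speckle`, and `augmented_batches` are hypothetical.

```python
import numpy as np

def random_flip(image, rng):
    """Flip an (H, W, C) image horizontally and/or vertically, each with probability 0.5."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]  # horizontal flip: reverse the column order
    if rng.random() < 0.5:
        image = image[::-1, :, :]  # vertical flip: reverse the row order
    return image

def add_speckle(image, rng, var=0.01):
    """Multiplicative noise: out = image + image * n, with n ~ N(0, var)."""
    noise = rng.normal(loc=0.0, scale=np.sqrt(var), size=image.shape)
    return image + image * noise

def augmented_batches(images, labels, batch_size, seed=0):
    """Yield (batch_images, batch_labels) indefinitely (hypothetical helper).

    Assumes batch_size <= len(images) and channels-last arrays.
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        idx = rng.choice(n, size=batch_size, replace=False)
        batch = np.stack([add_speckle(random_flip(images[i], rng), rng) for i in idx])
        yield batch.astype(np.float32), labels[idx]

# Toy data standing in for 64x64 patches with 13 bands.
images = np.random.default_rng(1).random((8, 64, 64, 13))
labels = np.arange(8) % 10
x, y = next(augmented_batches(images, labels, batch_size=4))
print(x.shape, y.shape)  # (4, 64, 64, 13) (4,)
```

Because both operations act on whole arrays, nothing in the sketch depends on the number of bands, which is the property a four-channel image pipeline lacks.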

Geospatial input support: The tool can ingest JPEG 2000 satellite images together with shapefiles containing point coordinates (centroids) for locations of interest. Users specify patch size and augmentation options; the tool crops patches on-the-fly during DNN training.
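The on-the-fly cropping step can be sketched as follows. This is a minimal sketch under stated assumptions, not the tool's implementation: the tile is assumed to be already loaded as an (H, W, C) NumPy array, the shapefile centroids are assumed to be already converted to pixel (row, column) positions, and `crop_patch` and `patches_at` are hypothetical names. Reading JPEG 2000 files and shapefiles is left out.

```python
import numpy as np

def crop_patch(tile, row, col, size):
    """Return a (size, size, C) patch around pixel (row, col) of an (H, W, C) tile.

    Indices that fall outside the tile are clipped to the border, which
    repeats edge pixels instead of reading out of bounds.
    """
    half = size // 2
    rows = np.clip(np.arange(row - half, row - half + size), 0, tile.shape[0] - 1)
    cols = np.clip(np.arange(col - half, col - half + size), 0, tile.shape[1] - 1)
    return tile[np.ix_(rows, cols)]

def patches_at(tile, centroids, size):
    """Stack one patch per (row, col) centroid into an (N, size, size, C) array."""
    return np.stack([crop_patch(tile, r, c, size) for r, c in centroids])

# Toy tile with 13 bands and three assumed pixel centroids.
tile = np.random.default_rng(0).random((200, 300, 13))
batch = patches_at(tile, [(100, 150), (10, 20), (199, 299)], size=64)
print(batch.shape)  # (3, 64, 64, 13)
```

Cropping per batch rather than ahead of time means the patch size and augmentation options can change between runs without regenerating a dataset on disk.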

Model: A VGG19 CNN was used as a simple test case. The input layer was expanded to 13 channels to accommodate Sentinel-2 bands. The fully connected stack was replaced by three layers of sizes 2048×2048, 2048×1024, and 1024×10 (the 10 units correspond to class labels). Implemented in TensorFlow with Keras.

Dataset: EuroSAT (Sentinel-2) hyperspectral images (64×64 with 13 bands), ten land-use/land-cover classes, per Helber et al., 2019.

Augmentation tests: Each augmentation was tested individually against a no-augmentation baseline. Parameters: horizontal and vertical flips, rotation up to 90°, zoom in/out up to a factor of 1.5, translation up to 25% of image size, and shearing up to 5%. Speckle noise was applied with zero mean and variance 0.010.

Training setup: Each model was trained for 10 epochs, 500 batches/epoch, batch size 128, using Adam (learning rate 0.001) with categorical cross-entropy. Training set size: 21,000 samples; test set: 9,000 samples. Without augmentation, each training sample is reused on average ~30 times across training. Total trainable parameters: 21,025,226.
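The reuse figure follows directly from the training setup; the short calculation below reproduces it from the numbers stated above.

```python
epochs = 10
batches_per_epoch = 500
batch_size = 128
train_samples = 21_000

# Total number of samples the model sees over the whole run.
samples_drawn = epochs * batches_per_epoch * batch_size

# Average number of times each training sample is presented.
reuse = samples_drawn / train_samples

print(samples_drawn, round(reuse, 1))  # 640000 30.5
```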

Key Findings
  • All augmentation techniques increased test accuracy compared to no augmentation, except translation, which decreased performance.
  • Numerical results (best epoch, training/testing accuracy):
    • No augmentation: epoch 7; train 86.72%; test 87.66%.
    • Horizontal/vertical flips: epoch 9; train 92.79%; test 93.43% (highest test accuracy).
    • Zoom (factor up to 1.5): epoch 9; train 89.16%; test 88.59%.
    • Translation (up to 25%): epoch 7; train 66.05%; test 70.00% (degradation).
    • Rotation (up to 90°): epoch 9; train 83.76%; test 90.00%.
    • Shearing (up to 5%): epoch 7; train 93.99%; test 90.93%.
    • Speckle noise (variance 0.010): epoch 8; train 92.18%; test 91.25%.
  • Speckle noise augmentation appears promising for satellite data and may help generalization across imagery from different satellites.
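To make the comparison with the baseline explicit, the snippet below computes the change in test accuracy (in percentage points) for each augmentation, using only the figures listed above. The dictionary keys are shorthand labels chosen for this example.

```python
baseline_test = 87.66  # test accuracy (%) without augmentation

test_accuracy = {
    "flips": 93.43,
    "zoom": 88.59,
    "translation": 70.00,
    "rotation": 90.00,
    "shearing": 90.93,
    "speckle": 91.25,
}

# Difference from the baseline, in percentage points.
gains = {name: round(acc - baseline_test, 2) for name, acc in test_accuracy.items()}

# Print from largest gain to largest loss.
for name, gain in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {gain:+.2f}")
```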
Discussion

Results indicate that standard augmentation methods for natural images remain beneficial for hyperspectral satellite imagery, with substantial gains from flips, shears, rotations, and speckle noise. The observed drop in performance for translation is attributed to the edge-replication padding strategy, which may distort content after shifts. To mitigate this, the tool was extended to extract patches from larger tiles via geospatial coordinates so that translations include real surrounding pixels instead of padded edges. The inclusion of speckle noise captures satellite-specific noise characteristics and improved test accuracy, suggesting better robustness and potential cross-sensor generalization. Overall, the tool enables effective augmentation within TensorFlow for >4-channel images, addressing a practical bottleneck in remote sensing deep learning workflows.
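The difference between the two translation strategies can be illustrated with a small NumPy sketch. It is an illustration under assumptions, not the tool's code: shifts are whole pixels, arrays are channels-last, and `translate_edge_pad` and `translate_from_tile` are hypothetical names.

```python
import numpy as np

def translate_edge_pad(patch, dy, dx):
    """Shift an (H, W, C) patch by (dy, dx) pixels; exposed area repeats edge pixels."""
    h, w = patch.shape[:2]
    rows = np.clip(np.arange(h) - dy, 0, h - 1)
    cols = np.clip(np.arange(w) - dx, 0, w - 1)
    return patch[np.ix_(rows, cols)]

def translate_from_tile(tile, top, left, size, dy, dx):
    """Apply the same shift by moving the crop window inside a larger tile,
    so the exposed area is filled with real neighbouring pixels."""
    r, c = top - dy, left - dx
    if r < 0 or c < 0 or r + size > tile.shape[0] or c + size > tile.shape[1]:
        raise ValueError("shifted window falls outside the tile")
    return tile[r:r + size, c:c + size]

# A 64x64 patch located at (top, left) = (32, 32) inside a 128x128 tile.
tile = np.random.default_rng(0).random((128, 128, 13))
patch = tile[32:96, 32:96]

# 16 pixels is 25% of a 64-pixel side, the maximum shift used in the tests.
padded = translate_edge_pad(patch, dy=16, dx=0)
real = translate_from_tile(tile, top=32, left=32, size=64, dy=16, dx=0)

# The shifted content agrees; only the newly exposed 16 rows differ.
print(np.array_equal(padded[16:], real[16:]))  # True
print(np.array_equal(padded[:16], real[:16]))  # False
```

With edge replication, the 16 exposed rows are copies of a single row of the patch, whereas the tile-based version fills them with the actual surroundings of the location.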

Conclusion

The paper introduces an open-source TensorFlow/Keras-compatible augmentation tool for hyperspectral satellite images, enabling common geometric transforms and speckle noise augmentation, as well as on-the-fly patch extraction from JPEG 2000 imagery using shapefiles. Empirical tests on EuroSAT with a 13-channel VGG19 variant show that most augmentations improve accuracy over a non-augmented baseline, with flips achieving the highest test accuracy and speckle noise also providing strong gains. Future work should explore satellite-specific augmentation methods and improved translation strategies using larger tiles to avoid harmful padding, further enhancing accuracy and generalizability.

Limitations
  • Evaluation limited to a single dataset (EuroSAT) and a single baseline architecture (modified VGG19) trained for 10 epochs, which may constrain generalizability of the findings.
  • Translation performance likely affected by the chosen padding method (edge replication), suggesting that results for translation may improve with larger-tile extraction methods.
  • The study assesses single-augmentation effects individually; combinations and broader hyperparameter sweeps were not reported.