A Survey of Deep Anomaly Detection in Multivariate Time Series: Taxonomy, Applications, and Directions

F. Wang, Y. Jiang, et al.

This paper reviews recent deep learning techniques for multivariate time series anomaly detection, proposing a taxonomy of detection strategies, summarizing their advantages and drawbacks, and organizing public datasets and application domains. It highlights the challenges of modeling temporal dependencies and inter-variable relationships.

Introduction
The paper addresses the need for robust multivariate time series anomaly detection (MTSAD), where anomalies may arise across multiple variables with complex temporal and inter-variable dependencies. Because traditional statistical and classical machine learning approaches struggle with high dimensionality and the scarcity of labeled anomalies, the study explores deep learning-based solutions (e.g., Transformers, GNNs, VAEs, GANs, diffusion models) that can capture nonlinear relationships and long-range temporal correlations. The purpose is to classify anomaly types, propose a taxonomy of deep MTSAD methods, review their strengths and weaknesses, and catalog commonly used datasets, thereby guiding practitioners and researchers and highlighting promising directions for future work.
Literature Review
The survey situates MTSAD within the broader context of anomaly detection, reviewing traditional approaches including statistical methods (MA, ARIMA), classical machine learning (OCSVM, SVDD), and proximity-based techniques (distance-, distribution-, and density-based). It highlights their limitations in multivariate, high-dimensional, sparsely labeled settings. The review then synthesizes deep learning advances across forecasting-, reconstruction-, and contrastive-based strategies, covering CNNs/TCNs, RNNs (LSTM/GRU), GNNs, Transformers, GANs, VAEs, and diffusion models. It describes representative works and their methodological innovations (e.g., self-supervised masking, anomaly-attention, graph attention, active learning, time-frequency fusion), and notes domain applications and available benchmarks.
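To make the traditional baselines concrete, here is a minimal sketch of a moving-average (MA) residual detector of the kind the survey contrasts with deep methods. The function name, window size, and threshold rule are illustrative choices, not values from the survey.

```python
import numpy as np

def moving_average_scores(x, window=5, k=3.0):
    """Score each point by its deviation from a centered moving average.

    Points whose absolute residual exceeds the mean residual plus
    k standard deviations are flagged. `window` and `k` are
    illustrative defaults, not values taken from the survey.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    ma = np.convolve(x, kernel, mode="same")  # centered moving average
    resid = np.abs(x - ma)                    # deviation from local trend
    thresh = resid.mean() + k * resid.std()
    return resid, resid > thresh

# A smooth periodic signal with one injected point anomaly.
signal = np.sin(np.linspace(0, 8 * np.pi, 200))
signal[120] += 5.0
scores, flags = moving_average_scores(signal)
```

This kind of detector handles a single variable with simple local trends; its weakness on high-dimensional, inter-dependent series is exactly what motivates the deep methods reviewed next.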
Methodology
As a survey, the methodology comprises: (1) formalizing the MTSAD problem and defining anomaly scoring; (2) proposing an anomaly type framework distinguishing intra-metric (temporal) anomalies—point-wise (global/local) and pattern-wise (shapelet, trend, cycle)—and inter-metric anomalies (global, local, temporal-local); (3) developing a taxonomy of deep MTSAD methods by learning paradigm (unsupervised, semi-, self-supervised) and model backbone (CNN/RNN/GNN/Transformer/AE/VAE/GAN/Diffusion/LLMs/MLP-Mixer) across three strategies (forecasting, reconstruction, contrastive); (4) outlining a general pipeline from data processing to representation learning and anomaly scoring; (5) compiling widely used datasets/benchmarks across application domains with characteristics (samples, entities, dimensions, anomaly rates). Pros and cons are analyzed per model family and strategy, and time/frequency domain considerations are discussed.
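The general pipeline outlined in step (4), from windowed data processing through representation learning to anomaly scoring, can be sketched with a reconstruction-based strategy. This is a minimal illustration in which PCA stands in for the learned encoder/decoder; the deep methods surveyed (AEs, VAEs, Transformers) replace it with neural networks. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def sliding_windows(X, w):
    """Split a (T, D) multivariate series into overlapping length-w windows."""
    return np.stack([X[i:i + w] for i in range(len(X) - w + 1)])

def reconstruction_scores(X, w=10, n_components=3):
    """Pipeline sketch: windowing -> low-dim representation -> reconstruction error.

    PCA is a stand-in for the representation-learning stage; the anomaly
    score is the per-window reconstruction error.
    """
    W = sliding_windows(X, w)                    # (N, w, D)
    flat = W.reshape(len(W), -1)                 # flatten each window
    mu = flat.mean(axis=0)
    _, _, Vt = np.linalg.svd(flat - mu, full_matrices=False)
    V = Vt[:n_components]                        # principal directions
    recon = (flat - mu) @ V.T @ V + mu           # project and reconstruct
    return np.linalg.norm(flat - recon, axis=1)  # per-window anomaly score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))  # 300 steps, 4 variables
X[150:155] += 6.0              # inject an anomalous segment
scores = reconstruction_scores(X)
```

Windows overlapping the injected segment reconstruct poorly and therefore receive the highest scores; thresholding these scores yields the final detection decision described in the scoring stage.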
Key Findings
- Anomaly taxonomy: clear delineation of temporal anomalies (point-wise global/local; pattern-wise shapelet/trend/cycle) and inter-metric anomalies (global, local, temporal-local), guiding method selection and evaluation.
- Method taxonomy: deep MTSAD methods grouped into forecasting, reconstruction, and contrastive strategies, each mapped to learning paradigms (unsupervised/semi-/self-supervised) and backbones (CNN, RNN, GNN, Transformer, AE/VAE/GAN/Diffusion, LLMs, MLP-Mixer).
- Representative advances:
  • Forecasting: MTAD-GAT and GDN (GNN-based), AnomalyBERT/MAD/CLFormer (Transformer-based), improving temporal/spatial dependency modeling.
  • Reconstruction: OmniAnomaly/InterFusion/LARA (VAE-based), USAD/MSCRED/NPSR (AE-based), TadGAN/MIM-GAN/DAEMON/DCGAN (GAN-based), Anomaly Transformer/TranAD/MEMTO/Dual-TF/CATCH (Transformer-based, including time-frequency fusion), and DiffusionAE and D³R (diffusion-based) for robustness and controllable bottlenecks.
  • Contrastive: DCdetector/TRL-CPC/RESIST (Transformer-based), PatchAD (MLP-Mixer), and LLM-based approaches (AnomalyLLM, aLLM4TS) for representation learning without extensive labels.
- Time-frequency integration improves point-level scoring and captures fine-grained characteristics (e.g., Dual-TF's nested sliding windows; CATCH's frequency patching and channel fusion).
- Dataset landscape: curated cross-domain resources including SMD (1,416,825 samples, 38 dims, server monitoring), PSM (132,480 samples, 25 dims, 27.76% anomaly rate), SMAP (562,800 samples, 25 dims), MSL (132,046 samples, 55 dims), SWaT (946,719 samples, 51 dims, 11.98% rate), WADI (957,372 samples, 123 dims, 5.99% rate), Kitsune (3,018,973 samples, 115 dims, cybersecurity), and Creditcard (284,807 samples, 29 dims, 0.17% rate), among others, enabling comprehensive evaluation across 10+ domains.
- Pros/cons synthesis: CNNs excel at local features; RNNs capture long-term dependencies but face scalability limits; GNNs model spatio-temporal relations at higher complexity; Transformers offer parallelism and long-range modeling but may need careful context handling; AE/VAE/GAN/Diffusion models each present distinct trade-offs in stability, expressiveness, and robustness.
- Future directions: discrepancy-focused learning, integration of domain/frequency knowledge, improved benchmarking and evaluation metrics, and leveraging LLMs/multimodal signals for enhanced generalization.
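For contrast with the reconstruction strategy, the forecasting strategy scores each step by its one-step-ahead prediction error. In this minimal sketch a least-squares linear predictor stands in for the deep forecasters (e.g., GNN- or Transformer-based models) catalogued above; the function name and window length are illustrative assumptions.

```python
import numpy as np

def forecasting_scores(X, w=10):
    """Forecasting-strategy sketch: anomaly score = one-step-ahead prediction error.

    A linear least-squares predictor stands in for the deep forecasting
    backbones surveyed; scores[i] refers to time step i + w.
    """
    T, D = X.shape
    # Build (history window -> next step) training pairs.
    H = np.stack([X[i:i + w].ravel() for i in range(T - w)])  # (T-w, w*D)
    Y = X[w:]                                                  # (T-w, D)
    coef, *_ = np.linalg.lstsq(H, Y, rcond=None)
    pred = H @ coef
    return np.linalg.norm(Y - pred, axis=1)

rng = np.random.default_rng(1)
t = np.arange(400)
X = np.stack([np.sin(0.1 * t), np.cos(0.1 * t)], axis=1)
X += 0.05 * rng.normal(size=X.shape)
X[250] += 3.0                   # point anomaly in both channels
scores = forecasting_scores(X)  # scores[i] corresponds to step i + 10
```

A point the predictor cannot anticipate from its recent history produces a large error spike, which is the core signal forecasting-based detectors threshold on.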
Discussion
By structuring anomaly types and method taxonomies, the survey clarifies the design space of MTSAD and maps model capabilities to anomaly characteristics and data regimes. The synthesis of forecasting, reconstruction, and contrastive strategies—together with analyses of temporal, spatial, and frequency perspectives—provides guidance on selecting appropriate backbones and learning paradigms under label scarcity and multivariate dependencies. The curated datasets and domain applications connect methodological advances to practical use cases. Discussion of pros/cons and emerging techniques (e.g., time-frequency fusion, memory modules, diffusion bottlenecks, LLM adaptation) highlights how current methods address core challenges—long-range dependencies, inter-variable relations, real-time constraints—and where gaps remain, informing research and deployment.
Conclusion
The paper contributes a comprehensive taxonomy of deep MTSAD methods, a systematic analysis of anomaly types, and an organized dataset compendium across application domains. It reviews 46+ deep models, assesses their strengths and limitations, and outlines promising directions: discrepancy-centric learning, integrating statistical/frequency-domain knowledge with deep models, developing realistic benchmarks and intuitive metrics, and exploiting LLMs and multimodal embeddings to bridge normal–anomalous distributions. These insights aim to advance accurate, robust, and scalable anomaly detection in complex multivariate time series systems.
Limitations
As a survey, the work does not present a unified experimental benchmark directly comparing all reviewed methods under standardized settings, nor does it introduce new datasets or metrics. Coverage is bounded by accessible literature up to late 2024 and may omit very recent or niche approaches. Reported dataset characteristics and model properties reflect sources that may vary in preprocessing, labeling quality, and evaluation protocols, potentially limiting generalizability across domains.