Localization and recognition of human action in 3D using transformers

J. Sun, L. Huang, et al.

This research, conducted by Jiankai Sun and colleagues, introduces the BABEL-TAL dataset and the LocATe model for 3D action localization, and reports state-of-the-art results. The work targets 3D human behavior analysis, with potential applications in HCI and healthcare.
Abstract
Understanding a person's behavior from their 3D motion sequence is a fundamental problem in computer vision with many applications. An important component of this problem is 3D action localization, which involves recognizing what actions a person is performing, and when the actions occur in the sequence. To promote the progress of the 3D action localization community, we introduce a new, challenging, and more complex benchmark dataset, BABEL-TAL (BT), for 3D action localization. Important baselines and evaluation metrics, as well as human evaluations, are carefully established on this benchmark. We also propose a strong baseline model, Localizing Actions with Transformers (LocATe), that jointly localizes and recognizes actions in a 3D sequence. The proposed LocATe shows superior performance on BABEL-TAL as well as on the large-scale PKU-MMD dataset, achieving state-of-the-art performance by using only 10% of the labeled training data. Our research could advance the development of more accurate and efficient systems for human behavior analysis, with potential applications in areas such as human-computer interaction and healthcare.
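To make the task concrete, here is a minimal sketch of what a localized action could look like as data: a label ("what") plus a time span ("when"). The abstract does not specify the evaluation metrics, so the temporal intersection-over-union (tIoU) check below is an assumption, chosen because it is a common way to compare predicted and annotated segments in temporal action localization. The names `ActionSegment`, `temporal_iou`, and `is_correct`, and the 0.5 threshold, are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSegment:
    """One localized action (hypothetical structure, not from the paper)."""
    label: str    # what the person is doing
    start: float  # when it begins, in seconds from the start of the sequence
    end: float    # when it ends, in seconds; expected to be >= start


def temporal_iou(a: ActionSegment, b: ActionSegment) -> float:
    """Length of the overlap of two time spans divided by the length of their union."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


def is_correct(pred: ActionSegment, truth: ActionSegment, threshold: float = 0.5) -> bool:
    """Assumed matching rule: same label and enough temporal overlap."""
    return pred.label == truth.label and temporal_iou(pred, truth) >= threshold


# Illustrative values only, not data from BABEL-TAL or PKU-MMD.
truth = ActionSegment("walk", 2.0, 6.0)
pred = ActionSegment("walk", 3.0, 7.0)
print(temporal_iou(pred, truth), is_correct(pred, truth))
```

Under this assumed rule, a prediction counts only if it gets both the action label and the timing approximately right, which mirrors the "what" and "when" framing of the task.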
Publisher
Communications Engineering
Published On
Sep 03, 2024
Authors
Jiankai Sun, Linjiang Huang, Hongsong Wang, Chuanyang Zheng, Jianing Qiu, Md Tauhidul Islam, Enze Xie, Bolei Zhou, Lei Xing, Arjun Chandrasekaran, Michael J. Black
Tags
3D action localization, BABEL-TAL, LocATe, transformers, human behavior analysis, HCI, healthcare