Intelligent metasurface system for automatic tracking of moving targets and wireless communications based on computer vision

Engineering and Technology

W. Li, Q. Ma, et al.

This paper presents a metasurface system that combines computer vision and smart beam tracking for dynamic target tracking and seamless wireless communication. The research was conducted by Weihan Li, Qian Ma, Che Liu, Yunfeng Zhang, Xianning Wu, Jiawei Wang, Shizhao Gao, Tianshuo Qiu, Tonghao Liu, Qiang Xiao, Jiaxuan Wei, Ting Ting Gu, Zhize Zhou, Fashuai Li, Qiang Cheng, Lianlin Li, Wenxuan Tang, and Tie Jun Cui.

Introduction
In the 5G era with massive device connectivity, demands for IoT-enabled positioning and tracking are acute. Traditional radar-based tracking can be inefficient due to complex, variable electromagnetic environments and the cost, complexity, and size of radar systems. More flexible hardware, faster information processing, and advanced communication theory are needed. Metamaterials and metasurfaces provide engineered control over electromagnetic waves with planar, low-loss structures. Digital coding and programmable metasurfaces, using discretized reflection phases controlled by active devices (e.g., PIN diodes, varactors), enable dynamic manipulation of waves, including polarization, amplitude, and transmission/reflection controls for applications like imaging, space-time modulation, and communications. While most programmable metasurfaces are human-controlled and prior self-adaptive systems focused on predesigned functions, integrating AI optimization and deep learning with metasurfaces promises real-time, self-adaptive responses. Concurrently, computer vision techniques (e.g., detection and tracking) can provide intuitive, reliable, and cost-effective target state estimation. This work combines computer-vision-based detection/tracking (YOLOv4-tiny) with a dual-polarized digital programmable metasurface driven by a pre-trained ANN to achieve real-time, accurate beam steering toward moving targets and simultaneous wireless communications, addressing challenges in 5G/6G networks.
Literature Review
The paper situates its contribution within several strands of prior work: (1) Metamaterials/metasurfaces enabling exotic wave phenomena and applications, such as negative refraction, holography, absorbers, Huygens surfaces, and nonreciprocal/reconfigurable components, with broad demonstrations across photonics, terahertz, and microwave engineering. (2) Digital coding and programmable metasurfaces that discretize reflection phases and, via integrated active devices, achieve dynamic control over polarization, amplitude, and transmission/reflection, enabling imaging, space-time coding, and wireless communications. (3) Initial self-adaptive metasurface systems using AI/deep learning for specified functionalities (e.g., cloaking) and optimization methods (e.g., PSO, hybrid genetic algorithms) for antenna/metasurface synthesis. (4) Rapid advances in AI-enabled computer vision for detection, classification, localization, and tracking, including handling multi-object tracking and occlusions. The gap identified is the lack of integrated, real-time, self-adaptive metasurface systems that autonomously detect and track moving targets and simultaneously communicate with them in complex environments. This work addresses that gap by unifying computer vision (YOLOv4-tiny) with a dual-polarized programmable metasurface controlled by a pre-trained ANN to realize intelligent beam tracking and communications.
Methodology
System architecture: A dual-polarized 1-bit digital programmable metasurface (DPM) is paired with an Intel RealSense Depth Camera D435i (RS-Camera). The RS-Camera acquires RGB-D frames at up to 40 FPS. A YOLOv4-tiny CNN processes each frame (input resized to 608×608×3) using a CSPDarknet53-tiny backbone with Mosaic augmentation and SPPNet, producing detections and estimating the target's elevation and azimuth angles relative to the camera. The camera and metasurface coordinate systems are unified so that the detected angles map accurately to beam directions.

DPM design: Each metasurface element is a dual-linearly polarized, 1-bit unit with two PIN diodes controlling the x- and y-polarized reflection phases. Geometry: a = 25 mm, b = 11.5 mm, c1 = 6 mm, c2 = 2.8 mm, d = 1.5 mm; substrates: F4B with εr = 3 and tanδ = 0.003, h1 = 3 mm (top) and h2 = 1 mm (bottom), separated by a metallic sheet. A backside sector structure (r = 5 mm, β = 120°) provides RF isolation and DC biasing. CST simulations characterize the four switching states (00, 01, 10, 11). For x-polarized incidence, the reflected amplitude exceeds ~0.8 near 5.8 GHz, and the ON/OFF states yield a ~180° phase difference around 5.8 GHz. The aperture comprises 18×18 = 324 elements. Independent bias lines for the two polarizations are printed on opposite sides of the PCB for separate control.

Beam-steering control: Digital coding sequences are derived to steer beams across the E-plane from −40° to 40° in 10° steps. An FPGA (Zynq-7000 SoC) applies the voltage sequences to the elements in real time. Because of the limited aperture (324 elements), sidelobes exist but are sufficiently suppressed to keep the beams directive.

Pre-trained ANN for code generation: A lightweight neural network (a ResNet34-based feature extractor for 2D inputs or a fully connected network for 1D inputs) maps the input angles (θ, φ) to an N-bit coding sequence, treated as a multi-label binary classification with one label per element. The loss function is binary cross-entropy with a sigmoid activation in the last layer; a 0.5 threshold binarizes the outputs. The ANN is trained to emulate low-sidelobe coding matrices obtained from particle swarm optimization (PSO), enabling faster inference than nonlinear optimization or standard backprop-based search. The ANN is designed for robustness, lightweight deployment, and anti-interference.

Control loop and sampling: The RS-Camera operates at 35 FPS for data collection; every third frame is processed so that the target position and coding sequence are updated at 0.2 s sampling intervals. Elevation and azimuth precision can reach 1°, with 3° increments used in the demonstrations; the detected angles are rounded to the nearest integer.

Experimental setups: Far-field beam patterns are measured in an anechoic chamber with a feeding horn (driven by a Keysight E8267D signal generator at 5.8 GHz) illuminating the DPM and a receiving antenna connected to a spectrum analyzer (Keysight E4447A) while rotating on a turntable. For tracking demonstrations, a VNA (Agilent N5230C) measures S21 between the feeding horn and a 5.77 GHz patch antenna representing the moving target on a model car. For RF signal detection, a portable detector (receiving patch, AD8317 detector, Arduino MCU, battery) logs the received power in real time. For wireless transmissions, a video module (5.65–5.95 GHz) modulates and sends video through the horn to the DPM; a receiver (antenna, decoder, screen) displays the recovered frames. BER is measured with an NI PXIe-5841 VST using QPSK at 170 Mbps.
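To make the first step concrete, here is a minimal sketch of how a detected bounding-box centre can be converted into elevation and azimuth angles under a pinhole camera model. The paper does not give this mapping at code level, so the intrinsics (fx, fy, cx, cy) and the example pixel values below are illustrative placeholders rather than the D435i's actual calibration.

```python
import numpy as np

def detection_to_angles(u, v, fx, fy, cx, cy):
    """Map the pixel centre (u, v) of a detected bounding box to
    (elevation, azimuth) in degrees under a pinhole camera model.
    fx, fy are focal lengths in pixels; (cx, cy) is the principal point."""
    # Back-project the pixel onto a viewing ray in the camera frame;
    # for pure angle estimation the depth value cancels out.
    x = (u - cx) / fx                      # horizontal ray component
    y = (v - cy) / fy                      # vertical ray component (image y points down)
    azimuth = np.degrees(np.arctan(x))     # left/right of the optical axis
    elevation = np.degrees(np.arctan(-y))  # up/down relative to the optical axis
    # The paper rounds detected angles to the nearest integer degree.
    return int(np.rint(elevation)), int(np.rint(azimuth))

# Placeholder example: a target detected at pixel (800, 300) in a 1280x720 frame.
elev, azim = detection_to_angles(800, 300, fx=640.0, fy=640.0, cx=640.0, cy=360.0)
print(elev, azim)
```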
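The beam-steering codes themselves are obtained in the paper from PSO-optimized, low-sidelobe solutions reproduced by the pre-trained ANN. As a simpler point of reference, the sketch below computes the textbook 1-bit coding matrix by quantizing the ideal phase gradient for an 18×18 aperture with a 25 mm period at 5.8 GHz; it is a baseline under those assumptions, not the authors' optimization.

```python
import numpy as np

def one_bit_coding(theta_deg, phi_deg, n=18, period=0.025, freq=5.8e9):
    """Binary n x n coding matrix steering a 1-bit reflective metasurface
    toward (theta, phi), by quantizing the ideal phase gradient to {0, pi}."""
    k0 = 2 * np.pi * freq / 3e8                              # free-space wavenumber
    theta, phi = np.radians(theta_deg), np.radians(phi_deg)
    # Element-centre coordinates in metres, aperture centred on the origin.
    pos = (np.arange(n) - (n - 1) / 2) * period
    x, y = np.meshgrid(pos, pos, indexing="ij")
    # Ideal compensation phase for a beam radiated toward (theta, phi).
    phase = k0 * np.sin(theta) * (x * np.cos(phi) + y * np.sin(phi))
    wrapped = np.mod(phase, 2 * np.pi)
    # 1-bit quantization: pick whichever of the two states (0 or pi) is closer.
    return ((wrapped > np.pi / 2) & (wrapped <= 3 * np.pi / 2)).astype(int)

# Example: E-plane scan to 30 degrees; print the first rows of the coding matrix.
print(one_bit_coding(theta_deg=30, phi_deg=0)[:3])
```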
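For the angle-to-code network, the paper specifies a lightweight model trained with binary cross-entropy, a sigmoid output layer, and a 0.5 threshold, using PSO-derived coding matrices as labels. The PyTorch sketch below follows that recipe for the 1D (fully connected) variant; the layer widths, learning rate, and the random placeholder label are assumptions for illustration.

```python
import torch
import torch.nn as nn

class CodeNet(nn.Module):
    """Fully connected network mapping a (theta, phi) pair to 324 per-element
    bit probabilities (multi-label binary classification, one bit per element)."""
    def __init__(self, n_elements=324):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, n_elements),  # raw logits; sigmoid applied in the loss / at inference
        )

    def forward(self, angles):
        return self.net(angles)

model = CodeNet()
loss_fn = nn.BCEWithLogitsLoss()                 # binary cross-entropy with built-in sigmoid
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# One illustrative training step: angles -> PSO-derived target coding matrix.
angles = torch.tensor([[30.0, 0.0]])             # (theta, phi) in degrees
target = torch.randint(0, 2, (1, 324)).float()   # placeholder for a real PSO label
loss = loss_fn(model(angles), target)
loss.backward()
optimizer.step()

# Inference: sigmoid probabilities thresholded at 0.5 give the 1-bit code.
with torch.no_grad():
    bits = (torch.sigmoid(model(angles)) > 0.5).int().reshape(18, 18)
```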
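Finally, a rough sketch of how the pieces close the loop at the stated 0.2 s update interval. The callables grab_frame, detect_target, angles_to_code, and write_to_fpga are hypothetical placeholders for the camera capture, YOLOv4-tiny detection, code generation (e.g., the network above), and the FPGA interface, none of which are specified at code level in the paper.

```python
import time

FRAME_SKIP = 3        # the paper processes every third camera frame
UPDATE_PERIOD = 0.2   # closed-loop sampling interval in seconds

def tracking_loop(grab_frame, detect_target, angles_to_code, write_to_fpga):
    """Closed-loop beam tracking: camera frame -> target angles -> coding
    sequence -> FPGA bias voltages. All four callables are placeholders."""
    frame_count = 0
    while True:
        tic = time.monotonic()
        frame = grab_frame()                       # RGB-D frame from the RS-Camera
        frame_count += 1
        if frame_count % FRAME_SKIP == 0:          # act on every third frame only
            detection = detect_target(frame)       # (elevation, azimuth) or None
            if detection is not None:
                code = angles_to_code(*detection)  # 18 x 18 one-bit coding matrix
                write_to_fpga(code)                # Zynq-7000 applies the biases
        # Crude pacing so that updates land roughly every UPDATE_PERIOD seconds.
        time.sleep(max(0.0, UPDATE_PERIOD / FRAME_SKIP - (time.monotonic() - tic)))
```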
Key Findings
- DPM element performance: At 5.8 GHz, the simulated reflected amplitude exceeds 0.8 and the ON/OFF states differ in phase by ~180° for x-polarized incidence, ensuring efficient 1-bit phase control.
- Beam steering: Measured E-plane beams scanned from −40° to 40° at 5.8 GHz under FPGA control. Peak gain decreased from 19.43 dB at broadside to 15.54 dB at the largest scan angles, with the expected beam broadening from the reduced effective aperture; sidelobes were present but sufficiently suppressed to maintain directive beams in both polarizations.
- Closed-loop tracking rate and precision: Real-time tracking at 0.2 s sampling intervals, limited by camera processing; angular precision up to 1° (3° increments used in the demonstrations), with accurate alignment between RS-Camera detection and DPM beam direction after coordinate unification.
- Moving-target detection and identification: VNA S21 measurements showed the received energy peaking as the car approached a fixed receiver and decreasing as it moved away, with the beam steered toward the target throughout the motion. Multi-object tracking and occlusion handling were demonstrated using deep SORT and class-based selection; low-light operation was achieved by switching from the RS-Camera to a night-vision infrared-cut camera.
- RF signal detection: With a fixed detector at mid-path, the received voltage (and calibrated power in dB) peaked when the car passed nearest the detector; when the detector moved with the car, the received power remained high and relatively stable. Outdoor field trials with both polarizations confirmed stable, high received levels when the detector moved with the target, consistent with the indoor results.
- Real-time wireless transmissions: Two video-transmission scenarios were demonstrated at 5.65–5.95 GHz. (1) Fixed receiver at mid-path: the video was decodable only when the car approached the receiver, illustrating the beam-directivity effect. (2) Receiver attached to the car: continuous, clear video throughout the motion due to dynamic beam tracking. BER measurements with QPSK at 170 Mbps showed BER ≈ 1e-5 when the channel was within the acceptance region. The programmable, dual-polarized DPM helped mitigate interference in a practical 5.8 GHz environment.
- System integration: The pre-trained ANN produced coding sequences within milliseconds of receiving the detected angles, enabling autonomous, human-free operation across detection, tracking, and communication tasks.
Discussion
The integration of computer vision (YOLOv4-tiny with RS-Camera) and a pre-trained ANN-controlled dual-polarized DPM achieves the research goal of autonomous tracking of moving targets with simultaneous wireless communication. Vision-based detection supplies real-time position (elevation and azimuth), which the ANN rapidly maps to DPM coding sequences to steer beams toward the target. Experimental results verify accurate beam steering across a wide angular sector, robust tracking at practical sampling rates, and successful communication, including high-throughput video transmission and low BER under favorable channel conditions. The system manages complex scenarios such as multi-object presence, occlusions, and low-light conditions by augmenting the sensing pipeline (deep SORT, NV-camera), and the dual polarization plus programmability support operation in interference-prone environments. Together, these findings demonstrate a viable pathway to intelligent, self-adaptive metasurface-based wireless networks that can sense, decide, and act autonomously in dynamic radio environments.
Conclusion
This work presents an intelligent metasurface system that autonomously detects, tracks, and communicates with moving targets. A dual-polarized 1-bit DPM, driven by a pre-trained ANN, receives real-time target angles from a YOLOv4-tiny vision pipeline and steers directive beams accordingly. Experiments demonstrate dynamic beam scanning (−40° to 40°), robust closed-loop tracking at 0.2 s sampling, quantitative RF detection, and reliable video transmission with BER ~1e-5 at 170 Mbps in acceptance regions. The system operates without human intervention and adapts to challenging conditions (multi-object, occlusion, low light). This integrated approach advances intelligent wireless networks and self-adaptive metasystems. Future research directions include: increasing aperture size and bit depth for lower sidelobes and higher beamforming fidelity; accelerating sensing and control loops (higher FPS cameras, optimized inference) for faster target dynamics; extending to multi-user MIMO-like beam management; robust operation under rich multipath and interference; and integrating additional sensing modalities for enhanced situational awareness.
Limitations
- Sampling-rate bottleneck: The closed-loop update is limited to ~0.2 s by the RS-Camera sampling and processing (every third frame at ~35 FPS), constraining the tracking of faster targets.
- Aperture size and sidelobes: The 18×18 (324-element) 1-bit DPM exhibits sidelobes due to its limited aperture and phase quantization; beam gain decreases and the beamwidth widens at larger scan angles as the effective aperture shrinks.
- Manufacturing constraints: The element period is relatively large at 5.8 GHz, and fabrication complexity limits the overall metasurface size; routing bias lines on both sides of the PCB adds complexity.
- Frequency/interference environment: Operation around 5.8 GHz can experience interference; although dual polarization and programmability help, robustness in dense deployments requires further study.
- Sensor alignment: Because the RS-Camera is not centered on the DPM aperture, the coordinate systems must be unified; residual alignment errors could affect pointing accuracy.