Spectrum-aware Multi-hop Task Routing in Vehicle-assisted Collaborative Edge Computing

Computer Science

Y. Deng, H. Zhang, et al.

This work by Yiqin Deng, Haixia Zhang, Xianhao Chen, and Yuguang Fang proposes a multi-hop task offloading framework for vehicle-assisted Multi-access Edge Computing (MEC), in which vehicles form a dynamic data transportation network that improves resource efficiency and service throughput while maintaining critical performance constraints.
Introduction

The paper addresses the limitations of one-hop computation offloading in MEC systems, where users typically offload tasks only to nearby edge servers (ESs). Such single-hop designs can fail when spectrum or compute resources at local ESs are constrained, especially for bandwidth- and compute-intensive applications (e.g., city-scale video analytics). The authors propose vehicle-assisted multi-hop task routing to expand the reachable set of ESs, enabling better load balancing across communication and computing resources and stronger end-to-end latency guarantees. The research question is how to design spectrum-aware multi-hop task routing and server selection that maximize system throughput (total completed task size) while meeting end-to-end latency constraints in a dynamic vehicular environment with uncertain mobility and variable channels. The purpose is to jointly optimize routing paths (via vehicles) and ES assignment under practical constraints, and to learn effective policies despite the intractability of exact models. The study is important because multi-hop offloading can enhance resource sharing, mitigate congestion, and improve quality of service (QoS) in realistic MEC deployments where spectrum and computing resources become bottlenecks at the edge.

Literature Review

The related work is organized into three areas: (1) Single-ES MEC: Works such as [8]-[10] optimize computation partitioning, service placement, and resource allocation assuming one-hop mobile-device-to-ES (MD-to-ES) connectivity, often with abundant spectrum or simplified queueing, and typically without ES collaboration. (2) Cooperative MEC across ESs: Studies like [11]-[12] allow ESs to share tasks via backhaul/backbone links to improve QoE, focusing on admission/scheduling and partial offloading; however, they usually assume ample inter-ES bandwidth and still rely on one-hop MD-to-ES uplinks. (3) Multi-hop MD-to-ES offloading: Few works consider multi-hop. Hui et al. [29] propose trusted relay selection for vehicular networks but ignore resource/QoS constraints. The authors’ prior work [21] uses vehicle-assisted relays for load balancing, but only in a single-MD scenario without complex multi-user routing. Other literature highlights the importance of multi-hop relaying for reliability in 5G NR V2X and public safety (e.g., 3GPP Release 17 studies), suggesting the feasibility of multi-hop vehicle relays. Overall, existing approaches largely overlook joint multi-hop routing with spectrum/computing constraints and end-to-end latency guarantees in dynamic vehicular environments, motivating a DRL-based solution to handle the curse of dimensionality and model uncertainty.

Methodology

System and framework: A vehicle-assisted MEC system with multiple MDs, vehicles, and ESs operates over a slotted timeline. Users can offload tasks to an ES within one hop or via multi-hop relays formed by vehicles. A centralized controller (e.g., SDN-based) has global knowledge and decides, per time slot, each MD’s destination ES, which implicitly determines a unique path (e.g., the shortest travel-distance path via vehicles to that ES). A service session comprises offloading (MD to first relay vehicle), relaying (vehicle-to-vehicle hops), uploading (last relay vehicle to ES), and computing at the ES. The return of results is ignored because result sizes are small in many applications.

Channel and latency models: Path loss follows a 3GPP-based model parameterized by distance and carrier frequency; link rate follows Shannon capacity with bandwidth B, transmit power, and Gaussian noise. For an MD’s task of size W_i(t), the per-link transmission latency is W_i(t)/R_ab(t) (MD-to-first-vehicle, each V2V hop, and last-vehicle-to-ES); the end-to-end transmission latency sums over these segments. Computing latency at ES j is κ W_i(t)/C_j, where κ is cycles per bit and C_j is the ES compute rate. Queueing latency at ES j accounts for uncompleted workloads from prior slots. The total end-to-end latency is the sum of transmission, computing, and queueing latencies.

Optimization problem: The objective is to maximize aggregated throughput, defined as the total size of tasks completed within their deadlines over the horizon, subject to communication (spectrum) and computing resource constraints and a per-task end-to-end latency bound D_i. The decision variables indicate per-MD path/ES selection per slot, with at most one route per MD per slot. This yields a mixed-integer, nonlinear, high-dimensional problem whose end-to-end latency is intractable to characterize exactly, due to multi-hop transmission, non-Poisson arrivals at intermediate nodes, and dynamic topology.
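The latency model above can be sketched in a few lines of Python. This is a minimal illustration under simplified assumptions (a scalar path-loss value in dB rather than the full 3GPP model, and no spectrum sharing across users); all function and parameter names are invented for illustration, not taken from the paper.

```python
import math

def shannon_rate(bandwidth_hz, tx_power_w, path_loss_db, noise_w):
    # Received power after path loss, then Shannon capacity in bit/s:
    # R = B * log2(1 + P_rx / N). (Illustrative; the paper's channel model
    # derives path loss from distance and carrier frequency.)
    rx_power_w = tx_power_w * 10.0 ** (-path_loss_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + rx_power_w / noise_w)

def end_to_end_latency(task_bits, link_rates_bps, kappa_cycles_per_bit,
                       es_rate_cps, queue_backlog_cycles=0.0):
    # Transmission latency W_i(t)/R_ab(t) summed over MD->vehicle,
    # each V2V hop, and vehicle->ES.
    tx = sum(task_bits / r for r in link_rates_bps)
    # Computing latency at the chosen ES j: kappa * W_i(t) / C_j.
    compute = kappa_cycles_per_bit * task_bits / es_rate_cps
    # Queueing latency from uncompleted workload of earlier slots.
    queue = queue_backlog_cycles / es_rate_cps
    return tx + compute + queue
```

For example, a 10^6-bit task over two 10 Mbit/s hops to an ES with κ=1200 cycles/bit and C_j=2×10^7 cycles/s incurs 0.2 s of transmission plus 60 s of computing latency, showing how compute capacity, not links, can dominate.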
MDP formulation: To address model uncertainty and dimensionality, the problem is cast as an MDP and solved via model-free DRL. The state includes vehicle status (feasible relays and inter-vehicle channels), ES status (compute capability and available bandwidth), and system workload (incoming task amounts and ES queue backlogs). Actions select, per MD, a destination ES; the multi-hop path is then uniquely determined (e.g., shortest path). The reward aligns with the objective: each MD’s immediate reward equals the size of tasks completed within deadline in the slot (including newly arrived and queued tasks). The goal is to maximize the discounted cumulative reward.

DRL approach (MADDPG): The authors adopt Multi-Agent Deep Deterministic Policy Gradient, decomposing the problem so that each MD is an agent. During training, critic networks use global observations (states and actions of all agents) to stabilize learning, while actor networks use only local observations for execution. DDPG-style actor and critic with target networks are used, with soft updates controlled by τ and exploration via Gaussian noise added to actor outputs. For the losses, the critic minimizes the squared TD error and the actor maximizes Q via the policy gradient; training uses experience replay. The approach learns dynamic ES selection, and hence the associated multi-hop routing, to balance spectrum and compute across the network.

Simulation setup: A road network with four ESs and moving vehicles (speed limit 60 km/h, safe distance 4 m) is simulated for 100 s. MDs are randomly placed (fixed over time); task arrivals follow a Poisson process with task sizes uniform in [2,5]×10^5 Kbits, and β denotes the per-slot task generation probability. Spectrum is allocated fairly, in proportion to transmitted task sizes. Key parameters include ES coverage radius 200 m, antenna height 1.5 m, carrier frequency 2.8 GHz, bandwidth 5 MHz, κ=1200 cycles/bit, ES compute rates [1,2,3,4]×10^7 cycles/s, MD transmit power 1 W, noise power 5×10^-13 W, and discount factor 0.99.
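The DDPG-style update mechanics described above, a TD target for the critic and Polyak soft updates of the target networks, can be sketched framework-free. This is an illustrative reduction using numpy and plain dictionaries as stand-ins for network parameters; it is not the authors’ code, and the function names are assumptions.

```python
import numpy as np

GAMMA = 0.99   # discount factor, as in the simulation setup
TAU = 0.005    # soft target-update rate tau

def td_target(reward, next_q, done=False, gamma=GAMMA):
    # Critic regression target: y = r + gamma * Q'(s', a') for
    # non-terminal transitions; the critic minimizes (Q(s, a) - y)^2.
    return reward + (0.0 if done else gamma * next_q)

def soft_update(target_params, online_params, tau=TAU):
    # Polyak averaging of target-network parameters:
    # theta' <- tau * theta + (1 - tau) * theta'.
    return {k: tau * online_params[k] + (1.0 - tau) * target_params[k]
            for k in online_params}
```

With τ=0.005 the target networks track the online networks slowly, which is what stabilizes the bootstrapped TD targets during training.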
Networks: The actor is a four-layer MLP with two 256-unit hidden layers (sigmoid activations); the critic processes the state and action streams separately, then merges them through two 256-unit ReLU layers. Learning rates are 0.001 (actor) and 0.002 (critic), with replay buffer size 10^5, minibatch size 64, soft target-update rate 0.005, and Gaussian exploration noise N(0.15, e^-2).

Baselines: (i) Single-hop offloading with admission control (one-hop only); (ii) Multi-hop+Greedy, which selects the nearest ES via multi-hop relaying when local ESs are overloaded.

Metrics: average throughput (completed task size per unit time) and success rate (ratio of completed to generated tasks per slot, averaged over the horizon).
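The actor architecture and output-noise exploration described above can be sketched with plain numpy. Layer widths follow the text (two 256-unit sigmoid hidden layers), but the weight initialization, observation/action dimensions, and function names here are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def make_actor(obs_dim, act_dim, hidden=256, seed=0):
    # Four-layer MLP: input -> 256 -> 256 -> output, matching the
    # described actor. Initialization scale is an arbitrary choice.
    rng = np.random.default_rng(seed)
    dims = [obs_dim, hidden, hidden, act_dim]
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def actor_forward(obs, params, noise_std=0.0, rng=None):
    h = obs
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = sigmoid(h)            # sigmoid on hidden layers only
    if noise_std > 0.0:               # Gaussian exploration noise on the
        rng = rng or np.random.default_rng()   # actor output, as in training
        h = h + rng.normal(0.0, noise_std, size=h.shape)
    return h
```

At execution time the agent would use `actor_forward(obs, params)` without noise; the per-ES output scores would then be mapped to a discrete ES choice.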

Key Findings
  • The proposed MADDPG-based multi-hop task routing consistently outperforms Single-hop and Multi-hop+Greedy across varied conditions (task arrival rates, number of MDs, number of vehicles, and number of ESs).
  • Throughput: MADDPG’s average throughput increases with higher task arrival rates and with more ESs, and remains relatively stable as the number of vehicles varies, indicating robustness to vehicular density changes. It also increases with the number of MDs in the tested range, benefitting from better load balancing.
  • Success rate: MADDPG achieves near-1 success rate at low arrival rates and approaches 1 as the number of ESs increases, whereas baselines show lower and more unstable success rates due to congestion and suboptimal ES selection.
  • Overall, the learning-based multi-hop strategy effectively balances both spectrum usage and computing workloads across ESs, delivering higher task completion under end-to-end latency constraints with low algorithmic complexity compared to naive multi-hop and one-hop strategies.
Discussion

The study demonstrates that enabling multi-hop, vehicle-assisted task delivery expands the feasible offloading space beyond one-hop ESs, improving resource sharing and load balancing under realistic spectrum and compute constraints. By framing the joint routing and ES selection as a multi-agent learning problem, MADDPG learns to trade off communication overheads and computing capacity dynamically, coping with mobility and channel variability. The gains over Single-hop arise from accessing additional ES resources beyond immediate coverage, while the gains over Multi-hop+Greedy stem from avoiding congestion and overloading by coordinated, reward-driven selection rather than nearest-ES heuristics. The results validate that spectrum-aware multi-hop routing is a practical approach to meet end-to-end latency objectives and maximize throughput in dynamic vehicular MEC networks.

Conclusion

The paper introduces a spectrum-aware, vehicle-assisted multi-hop task offloading framework for collaborative MEC and formulates a throughput maximization problem under end-to-end latency, spectrum, and computing constraints. It develops a MADDPG-based multi-agent offloading strategy that jointly selects ESs and implicitly determines multi-hop routes via vehicles, learning effective policies in a highly dynamic environment. Simulations confirm substantial improvements in task completion and success rate over one-hop and greedy multi-hop baselines, with robustness to variations in task load, MD population, vehicle density, and number of ESs. Potential future directions include incorporating result-return latencies and downlink constraints, decentralized or partially-observed training/execution with limited global information, integration of more realistic V2X communication standards and interference models, adaptive path selection beyond shortest-distance routing (e.g., reliability-aware routing), and real-world validations on vehicular testbeds.

Limitations
  • The return of computation results is ignored, assuming result sizes are small; this may understate end-to-end latency in applications with larger outputs.
  • A centralized controller with global network knowledge is assumed for decision making during training; practicality under partial observability or limited signaling is not evaluated.
  • The routing path is uniquely determined once the ES is chosen (e.g., shortest travel distance), which may not always align with instantaneous link quality or reliability.
  • The analytical characterization of end-to-end latency is not in closed form due to multi-hop and queueing complexity; performance is assessed via simulation.
  • Simulations use specific traffic models, spectrum allocation rules, and parameter settings; real-world variability (e.g., interference, non-ideal MAC, control overheads) is not explicitly modeled.