T-GVC: Trajectory-Guided Generative Video Coding at Ultra-Low Bitrates

Zhitao Wang1, Hengyu Man1*, Wenrui Li1, Xingtao Wang1, Xiaopeng Fan1,2,3, Debin Zhao1
1Harbin Institute of Technology
2Harbin Institute of Technology Suzhou Research Institute
3Peng Cheng Laboratory
📧 zhitao.wang.hit@outlook.com, manhengyu@hotmail.com, liwr618@163.com,
{xtwang, fxp, dbzhao}@hit.edu.cn

Abstract

Recent advances in video generation have given rise to generative video coding, an emerging paradigm that leverages strong generative priors for ultra-low bitrate (ULB) scenarios. However, most existing methods are either limited to specific domains (e.g., facial or human videos) or depend heavily on high-level text guidance, which often fails to capture motion details and leads to unrealistic reconstructions.

To address these challenges, we propose a Trajectory-Guided Generative Video Coding framework (dubbed T-GVC). T-GVC employs a semantic-aware sparse motion sampling pipeline to effectively bridge low-level motion tracking with high-level semantic understanding by extracting pixel-wise motion as sparse trajectory points based on their semantic importance. This not only significantly reduces the bitrate but also preserves critical temporal semantic information.
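As a rough illustration of what importance-driven sparse sampling could look like, the sketch below keeps at most one point per grid cell and then retains the k highest-scoring points. It is not the T-GVC pipeline: the function name `sample_sparse_points`, the inputs `flow` and `importance`, the grid size `cell`, and the scoring rule (semantic weight times motion magnitude) are all assumptions made for this example.

```python
import numpy as np

def sample_sparse_points(flow, importance, k, cell=8):
    """Pick up to k (row, col) points, at most one per cell x cell block.

    flow:       (H, W, 2) dense per-pixel displacement (hypothetical input)
    importance: (H, W) non-negative semantic weight (hypothetical input)
    """
    h, w = importance.shape
    # Assumed heuristic: score = semantic weight * motion magnitude.
    score = importance * np.linalg.norm(flow, axis=-1)
    candidates = []
    for r0 in range(0, h, cell):
        for c0 in range(0, w, cell):
            block = score[r0:r0 + cell, c0:c0 + cell]
            # Best-scoring pixel inside this block.
            dr, dc = np.unravel_index(np.argmax(block), block.shape)
            candidates.append((block[dr, dc], r0 + dr, c0 + dc))
    # Keep the k highest-scoring block winners (stable sort on score only).
    candidates.sort(key=lambda t: t[0], reverse=True)
    return [(int(r), int(c)) for _, r, c in candidates[:k]]
```

The selected points would then be tracked across frames and only their coordinates transmitted, which is where the bitrate saving described above would come from. The per-cell limit is one simple way to keep the points spatially spread out.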

In addition, by incorporating trajectory-aligned loss constraints into diffusion processes, we introduce a training-free latent space guidance mechanism to ensure physically plausible motion patterns without sacrificing the inherent capabilities of generative models. Experimental results demonstrate that our framework outperforms both traditional and neural video codecs under ULB conditions. Furthermore, additional experiments confirm that our approach achieves more precise motion control than existing text-guided methods, paving the way for a novel direction of generative video coding guided by geometric motion modeling.
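The idea of training-free latent guidance can be sketched with a toy example: a loss is computed on the latent from the decoded trajectories, and the latent itself, not the model weights, is nudged down the gradient of that loss. The loss below (squared distance between the latent at each tracked position and the latent at the track's first-frame position) is an assumption chosen for illustration and is not claimed to be the paper's trajectory-aligned loss. The names `trajectory_loss_and_grad`, `guided_update`, and `step_size` are likewise hypothetical.

```python
import numpy as np

def trajectory_loss_and_grad(z, tracks):
    """Return (loss, gradient) for a latent video z of shape (T, C, H, W).

    tracks: list of tracks; each track is a list of (row, col), one per frame.
    Assumed loss: latent features along a track should match the feature
    at the track's first-frame position.
    """
    grad = np.zeros_like(z)
    loss = 0.0
    for track in tracks:
        r0, c0 = track[0]
        for t in range(1, len(track)):
            r, c = track[t]
            diff = z[t, :, r, c] - z[0, :, r0, c0]
            loss += float(diff @ diff)
            # Analytic gradient of ||a - b||^2 with respect to a and b.
            grad[t, :, r, c] += 2.0 * diff
            grad[0, :, r0, c0] -= 2.0 * diff
    return loss, grad

def guided_update(z, tracks, step_size=0.1):
    """One gradient step on the latent; no model parameters are changed."""
    _, grad = trajectory_loss_and_grad(z, tracks)
    return z - step_size * grad
```

In an actual diffusion sampler, an update of this kind would typically be applied to the intermediate latent at some or all denoising steps, interleaved with the model's own denoising update. Because only the latent is modified, the pretrained generative model is used as-is.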

Visual Results

Comparative Results

T-GVC demo 1: BasketballDrill
T-GVC demo 2: YachtRide
T-GVC demo 3: HoneyBee
VTM demo 1: BasketballDrill
VTM demo 2: YachtRide
VTM demo 3: HoneyBee
DCVC-FM demo 1: BasketballDrill
DCVC-FM demo 2: YachtRide
DCVC-FM demo 3: HoneyBee

Ablation Study

demo 1: BasketballDrill (no guidance)
demo 2: BasketballDrill (text-guided)
demo 3: BasketballDrill (trajectories+text-guided)