Deep Generative Video Compression


We propose an end-to-end, deep probabilistic modeling approach to compressing low-resolution videos. Our approach builds upon variational autoencoder (VAE) models for sequential data and combines them with recent work on neural image compression.

November 4, 2019
NeurIPS 2019


Authors

Jun Han (Dartmouth College)

Salvatore Lombardo (Disney Research LA)

Christopher Schroers (DisneyResearch|Studios)

Stephan Mandt (University of California, Irvine)



Abstract

We propose an end-to-end, deep probabilistic modeling approach to compressing low-resolution videos. Our approach builds upon variational autoencoder (VAE) models for sequential data and combines them with recent work on neural image compression. The model jointly learns to transform the original video into a lower-dimensional representation and to entropy-code this representation according to the predictions of the sequential VAE. We split the latent space into local (per-frame) and global (per-segment) variables and show that training the VAE to utilize both representations leads to an improved rate-distortion tradeoff. Evaluations on small videos from public data sets of varying complexity and diversity show that our model yields competitive results when trained on generic video content, and extreme compression performance when trained on specialized content.
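
The abstract packs several components into a few sentences, so the sketch below illustrates in broad strokes how a sequential VAE with split local/global latents and a learned entropy model can be wired together. This is a minimal PyTorch illustration under assumed names and sizes (SplitLatentVideoVAE, the GRU encoders, the MLP decoder, and all dimensions are hypothetical); it is not the authors' architecture, which the paper specifies in full.

```python
import torch
import torch.nn as nn

class SplitLatentVideoVAE(nn.Module):
    """Toy sequential VAE with a global (per-segment) and local (per-frame) code.

    Frames are flattened vectors here for brevity; a real model would use
    convolutional networks on pixels.
    """

    def __init__(self, frame_dim=1024, local_dim=32, global_dim=64, hidden=256):
        super().__init__()
        # Global inference network: summarizes the whole segment into one code.
        self.global_enc = nn.GRU(frame_dim, hidden, batch_first=True)
        self.global_head = nn.Linear(hidden, 2 * global_dim)
        # Local inference network: one code per frame, conditioned on the global code.
        self.local_enc = nn.Linear(frame_dim + global_dim, 2 * local_dim)
        # Sequential prior over local codes; its predictions drive entropy coding.
        self.prior_rnn = nn.GRU(local_dim, hidden, batch_first=True)
        self.prior_head = nn.Linear(hidden, 2 * local_dim)
        # Decoder: reconstructs each frame from its local code plus the global code.
        self.dec = nn.Sequential(
            nn.Linear(local_dim + global_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, frame_dim),
        )

    @staticmethod
    def reparam(stats):
        # Split into mean/log-variance and draw a reparameterized sample.
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp(), mu, logvar

    def forward(self, x):                          # x: (batch, time, frame_dim)
        B, T, _ = x.shape
        _, h = self.global_enc(x)                  # h[-1]: (batch, hidden)
        g, g_mu, g_lv = self.reparam(self.global_head(h[-1]))
        g_seq = g.unsqueeze(1).expand(B, T, g.size(-1))
        z, z_mu, z_lv = self.reparam(self.local_enc(torch.cat([x, g_seq], -1)))
        x_hat = self.dec(torch.cat([z, g_seq], -1))
        # The prior predicts z_t from z_{<t}; shift the sequence right by one step.
        z_prev = torch.cat([torch.zeros_like(z[:, :1]), z[:, :-1]], dim=1)
        p_out, _ = self.prior_rnn(z_prev)
        p_mu, p_lv = self.prior_head(p_out).chunk(2, dim=-1)
        # Rate: KL(posterior || sequential prior) for the local codes, plus a
        # KL against a standard normal for the global code.
        rate = 0.5 * ((z_lv - p_lv).exp() + (z_mu - p_mu) ** 2 / p_lv.exp()
                      - 1.0 + p_lv - z_lv).sum()
        rate = rate - 0.5 * (1.0 + g_lv - g_mu ** 2 - g_lv.exp()).sum()
        distortion = ((x - x_hat) ** 2).sum()      # simple MSE distortion
        return distortion, rate
```

A training loop in this sketch would minimize distortion plus a weighted rate term, with the weight setting the point on the rate-distortion curve; at encoding time the local codes would be quantized and arithmetic-coded under the sequential prior, so frames the prior predicts well cost few bits:

```python
model = SplitLatentVideoVAE()
video = torch.randn(4, 16, 1024)   # 4 toy segments of 16 flattened frames
distortion, rate = model(video)
loss = distortion + 0.1 * rate     # beta = 0.1 is an arbitrary illustrative weight
loss.backward()
```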
