Neural Inter-Frame Compression for Video Coding

 

In this work, we present an inter-frame compression approach for neural video coding that builds seamlessly on existing neural image codecs.

October 27, 2019
International Conference on Computer Vision (ICCV) 2019

 

Authors

Abdelaziz Djelouah (DisneyResearch|Studios)

Joaquim Campos (DisneyResearch|Studios Intern)

Simone Schaub-Meyer (DisneyResearch|Studios/ETH Joint PhD)

Christopher Schroers (DisneyResearch|Studios)

 

Abstract

While there are many deep-learning-based approaches for single image compression, the field of end-to-end learned video coding has remained much less explored. Therefore, in this work we present an inter-frame compression approach for neural video coding that builds seamlessly on existing neural image codecs. Our end-to-end solution performs temporal prediction by optical-flow-based motion compensation in pixel space. The key insight is that we can increase both decoding efficiency and reconstruction quality by encoding the required side information into a latent representation that directly decodes into motion and blending coefficients. To account for the remaining prediction errors, residual information between the original image and the interpolated frame is needed. We propose to compute residuals directly in latent space instead of in pixel space, as this allows us to reuse the same image compression network for both key frames and intermediate frames. This has the advantage of making our video coding approach more coherent, more memory-efficient, and easier to train. Our extended evaluation on different datasets and resolutions shows that the rate-distortion performance of our approach is competitive with existing state-of-the-art codecs.
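The latent-space residual idea in the abstract can be illustrated with a toy sketch. It is not the paper's network: `encode` and `decode` are hypothetical stand-ins for a learned image codec (here a fixed orthonormal 2-point transform), `blend` assumes the two reference frames have already been motion-compensated, and the scalar `alpha` and the quantization `step` are illustrative values rather than decoded coefficients.

```python
import math

S = 1.0 / math.sqrt(2.0)

def encode(frame):
    """Hypothetical stand-in for an image encoder: orthonormal 2-point transform."""
    latent = []
    for i in range(0, len(frame), 2):
        a, b = frame[i], frame[i + 1]
        latent.append((a + b) * S)
        latent.append((a - b) * S)
    return latent

def decode(latent):
    """Stand-in decoder; this particular transform is its own inverse."""
    return encode(latent)

def blend(warped_prev, warped_next, alpha):
    """Pixel-space prediction from two already-warped reference frames (assumed)."""
    return [alpha * p + (1.0 - alpha) * n for p, n in zip(warped_prev, warped_next)]

def quantize(values, step):
    """Uniform scalar quantization, standing in for the lossy entropy-coded path."""
    return [round(v / step) * step for v in values]

def encode_inter(frame, prediction, step):
    """Residual is taken between latents, so the same codec serves both frame types."""
    y = encode(frame)
    y_pred = encode(prediction)
    return quantize([a - b for a, b in zip(y, y_pred)], step)

def decode_inter(prediction, q_residual):
    """Decoder re-encodes its own prediction, adds the residual, then decodes."""
    y_pred = encode(prediction)
    return decode([p + r for p, r in zip(y_pred, q_residual)])

if __name__ == "__main__":
    frame = [10.0, 12.0, 11.0, 9.0]
    prediction = blend([9.0, 12.0, 10.0, 9.0], [11.0, 11.0, 12.0, 10.0], alpha=0.5)
    q_res = encode_inter(frame, prediction, step=0.25)
    print(decode_inter(prediction, q_res))
```

Because the key-frame path and the residual path both go through `encode` and `decode`, no second network is needed for pixel-space residuals, which is the reuse the abstract points to.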
