Learning Video Object Segmentation from Static Images
Inspired by recent advances in deep learning for instance segmentation and object tracking, we cast video object segmentation as a guided instance segmentation problem.
July 22, 2017
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
Authors
Federico Perazzi (Disney Research/ETH Joint PhD)
Anna Khoreva (Max Planck Institute for Informatics, Saarbrucken)
Rodrigo Benenson (Max Planck Institute for Informatics, Saarbrucken)
Bernt Schiele (Max Planck Institute for Informatics, Saarbrucken)
Alexander Sorkine-Hornung (Disney Research)
We demonstrate that highly accurate object segmentation in videos can be achieved with a convnet trained on static images only. The key ingredient of our approach is a combination of offline and online learning strategies: the former produces a refined mask from the previous frame's estimate, and the latter captures the appearance of the specific object instance. Our method handles different types of input annotation (bounding boxes and segments) and can incorporate multiple annotated frames, making the system suitable for diverse applications. We obtain competitive results on three different datasets, independent of the type of input annotation.
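The per-frame mask refinement described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `box_to_mask`, `propagate_masks`, `refine_fn`, and `identity_refiner` are hypothetical names, and stacking the previous mask as a fourth input channel next to RGB is an assumption made here for concreteness. The trained convnet is replaced by a trivial stand-in.

```python
import numpy as np


def box_to_mask(box, height, width):
    """Rasterize a bounding box (x0, y0, x1, y1) into a binary mask.

    Assumption: a box annotation is turned into a rough initial mask.
    """
    x0, y0, x1, y1 = box
    mask = np.zeros((height, width), dtype=np.float32)
    mask[y0:y1, x0:x1] = 1.0
    return mask


def propagate_masks(frames, initial_mask, refine_fn):
    """Segment a video frame by frame, guided by the previous estimate.

    refine_fn is a hypothetical stand-in for the trained convnet. It takes
    an (H, W, 4) array (RGB plus previous mask) and returns an (H, W) mask.
    """
    masks = []
    prev_mask = initial_mask
    for frame in frames:
        # Stack the previous mask estimate onto the RGB frame as guidance.
        guided_input = np.concatenate([frame, prev_mask[..., None]], axis=-1)
        prev_mask = refine_fn(guided_input)
        masks.append(prev_mask)
    return masks


def identity_refiner(guided_input):
    """Placeholder model: returns the guidance channel unchanged."""
    return guided_input[..., 3]


if __name__ == "__main__":
    height, width = 4, 6
    frames = [np.zeros((height, width, 3), dtype=np.float32) for _ in range(3)]
    initial = box_to_mask((1, 1, 4, 3), height, width)
    result = propagate_masks(frames, initial, identity_refiner)
    print(len(result), result[-1].shape)
```

In a real system, `refine_fn` would be the network trained offline on static images and then fine-tuned online on the annotated frame(s) of the target object; the loop structure stays the same.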