Attention-Driven Cropping for Very High Resolution Facial Landmark Detection

Facial landmark detection is a fundamental task for many consumer and high-end applications and is almost entirely solved by machine learning methods today.

June 16, 2020
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020

Authors

Prashanth Chandran (DisneyResearch|Studios/ETH Joint PhD)

Derek Bradley (DisneyResearch|Studios)

Thabo Beeler (DisneyResearch|Studios)

Markus Gross (DisneyResearch|Studios/ETH Zurich)

Abstract

Facial landmark detection is a fundamental task for many consumer and high-end applications. Today, landmark detection is almost entirely solved by machine learning methods that are trained on a dataset of hand-annotated images. Existing datasets are primarily made up of low resolution images, and current algorithms are limited to inputs of quality and resolution comparable to the training dataset. On the other hand, high resolution imagery is becoming increasingly common as consumer cameras improve in quality every year. Therefore, there is a need for algorithms that can leverage the rich information available in high resolution imagery. Naïvely attempting to reuse existing network architectures on high resolution imagery is prohibitive due to memory bottlenecks on GPUs. The only current solution is to downsample the images, sacrificing resolution and quality. Building on recent progress in attention-based networks, we present a novel, fully convolutional regional architecture that is specially designed for predicting landmarks on very high resolution facial images without downsampling. We demonstrate the flexibility of our architecture by training the proposed model with images of resolutions ranging from 256 x 256 to 4K. In addition to being the first method for facial landmark detection on high resolution images, our approach achieves superior performance over traditional (holistic) state-of-the-art architectures across all resolutions, leading to a general-purpose, extremely flexible, high quality landmark detector.
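The memory bottleneck mentioned in the abstract can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes that "4K" means 3840 x 2160 pixels and that, for a fixed fully convolutional network, the size of each intermediate feature map grows in proportion to the number of input pixels. The function names are hypothetical and chosen only for illustration.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in an image of the given size."""
    return width * height


def activation_scale(width: int, height: int,
                     base_width: int = 256, base_height: int = 256) -> float:
    """Ratio of pixel counts between an input and a baseline input.

    Under the stated assumption (feature maps scale with input pixels),
    this approximates how much more activation memory a holistic network
    would need compared with the baseline resolution.
    """
    return pixel_count(width, height) / pixel_count(base_width, base_height)


# Assumption: "4K" is taken to mean UHD, 3840 x 2160 pixels.
UHD_4K = (3840, 2160)

scale = activation_scale(*UHD_4K)
print(f"A 4K input has about {scale:.1f}x the pixels of a 256 x 256 input")
```

Under these assumptions a full-resolution 4K image carries over a hundred times as many pixels as a 256 x 256 image, which suggests why feeding it directly to a network designed for low resolution inputs would strain GPU memory.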

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.
