Shape Transformers: Topology-Independent 3D Shape Models Using Transformers

 

We present a new nonlinear parametric 3D shape model based on transformer architectures.

April 25, 2022
Eurographics 2022

 

Authors

Prashanth Chandran (DisneyResearch|Studios/ETH Joint PhD)

Gaspard Zoss (DisneyResearch|Studios)

Markus Gross (DisneyResearch|Studios/ETH Zurich)

Paulo Gotardo (DisneyResearch|Studios)

Derek Bradley (DisneyResearch|Studios)

 


Abstract

Parametric 3D shape models (e.g., for faces) are heavily used in computer graphics and vision applications to provide priors on the observed variability of an object’s geometry. The original models were linear and operated on the entire shape at once. They were later extended to provide localized control over different shape parts separately. In deep shape models, nonlinearity was introduced via a sequence of fully connected layers and activation functions, and locality was introduced in recent models that use mesh convolution networks. As common limitations, these models often dictate, in one way or another, the allowed extent of spatial correlations, and they also require that a fixed mesh topology be specified ahead of time. To overcome these limitations, we present a new nonlinear parametric 3D shape model based on transformer architectures. A key benefit of this new model comes from using the transformer’s “self-attention” mechanism to automatically learn nonlinear spatial correlations for a class of 3D shapes. This is in contrast to global models that correlate everything and local models that dictate the correlation extent. Our transformer 3D shape autoencoder is a better alternative to mesh convolution models, which require specially crafted convolution and down/up-sampling operators that can be difficult to design. Additionally, our model is topology-independent: it can be trained once and then evaluated on any mesh topology, unlike previous methods. We demonstrate the application of our model to different datasets, including 3D faces, 3D hand shapes, and full human bodies. Our experiments demonstrate the strong potential of our transformer-based 3D shape model in several applications in computer graphics and vision.
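To make the self-attention idea in the abstract more concrete, the following is a minimal sketch in NumPy, not the authors' architecture. It assumes each mesh vertex is embedded as one token, and every name in it (`self_attention`, `w_embed`, `w_q`, `w_k`, `w_v`, the dimensions, the random data) is hypothetical and chosen only for illustration. It shows a single attention head whose weights do not depend on the number of vertices, so the same parameters can be applied to meshes of different sizes.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    tokens: (n_vertices, d_model) array, one token per vertex (an assumption).
    Returns the attended tokens and the (n_vertices, n_vertices) attention map.
    """
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    # Every vertex scores every other vertex, so the correlation extent
    # is not fixed in advance by the operator itself.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

rng = np.random.default_rng(0)
d_in, d_model = 3, 16  # hypothetical sizes: xyz input, 16-dim tokens

# Hypothetical parameters; none of their shapes involve the vertex count.
w_embed = rng.normal(size=(d_in, d_model))
w_q = rng.normal(size=(d_model, d_model)) * 0.1
w_k = rng.normal(size=(d_model, d_model)) * 0.1
w_v = rng.normal(size=(d_model, d_model)) * 0.1

# Apply the same weights to two random "meshes" with different vertex counts.
for n_vertices in (100, 250):
    vertices = rng.normal(size=(n_vertices, d_in))
    tokens = vertices @ w_embed
    out, attn = self_attention(tokens, w_q, w_k, w_v)
    print(n_vertices, out.shape, attn.shape)
```

Because the parameter shapes depend only on the token dimension, nothing in this sketch has to change when the vertex count changes. A trained model would learn these weights from data, and the published model involves considerably more than this single layer.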
