ScaffoldAvatar: High-Fidelity Gaussian Avatars with Patch Expressions

In this work, we propose coupling locally defined facial expressions with 3D Gaussian splatting to create ultra-high-fidelity, expressive, and photorealistic head avatars.

July 10, 2025
SIGGRAPH (2025)

Authors

Shivangi Aneja (Technical University of Munich, DisneyResearch|Studios)

Sebastian Weiss (DisneyResearch|Studios)

Irene Baeza (DisneyResearch|Studios)

Prashanth Chandran (DisneyResearch|Studios)

Gaspard Zoss (DisneyResearch|Studios)

Matthias Niessner (Technical University of Munich)

Derek Bradley (DisneyResearch|Studios)

Abstract

Generating high-fidelity real-time animated sequences of photorealistic 3D head avatars is important for many graphics applications, including immersive telepresence and movies. This is a challenging problem, particularly when rendering close-ups of a digital avatar that show the character’s facial microfeatures and expressions. To capture the expressive, detailed nature of human heads, including skin furrowing and finer-scale facial movements, we propose to couple locally defined facial expressions with 3D Gaussian splatting to create ultra-high-fidelity, expressive, and photorealistic head avatars. In contrast to previous works that operate on a global expression space, we condition our avatar’s dynamics on patch-based local expression features and synthesize 3D Gaussians at the patch level. In particular, we leverage a patch-based geometric 3D face model to extract patch expressions and learn how to translate these into local dynamic skin appearance and motion by coupling the patches with anchor points of Scaffold-GS, a recent hierarchical scene representation. These anchors are then used to synthesize 3D Gaussians on the fly, conditioned on patch expressions and viewing direction. We employ color-based densification and progressive training to obtain high-quality results and faster convergence for high-resolution 3K training images. By leveraging patch-level expressions, ScaffoldAvatar consistently achieves state-of-the-art performance with visually natural motion, while encompassing diverse facial expressions and styles in real time.
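
The anchor-based synthesis described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`init_weights`, `synthesize_gaussians`), the single two-layer MLP, the feature sizes, and the choice of four Gaussians per anchor are all assumptions made for illustration, and scale and rotation outputs are omitted for brevity. The sketch only shows the general idea of decoding a few Gaussians from one anchor, conditioned on that anchor's patch expression code and the viewing direction.

```python
import numpy as np


def init_weights(feat_dim, expr_dim, hidden, k, rng):
    """Random weights for a tiny two-layer MLP (hypothetical; a real model is trained)."""
    in_dim = feat_dim + expr_dim + 3      # anchor feature + patch expression + view direction
    out_dim = k * 7                       # per Gaussian: 3 offset + 3 color + 1 opacity
    return {
        "w1": rng.normal(0.0, 0.1, size=(hidden, in_dim)),
        "b1": np.zeros(hidden),
        "w2": rng.normal(0.0, 0.1, size=(out_dim, hidden)),
        "b2": np.zeros(out_dim),
    }


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def synthesize_gaussians(anchor_pos, anchor_feat, patch_expr, cam_pos, weights, k=4):
    """Decode k Gaussians from one anchor (illustrative sketch, assumed architecture)."""
    # Unit viewing direction from the anchor towards the camera.
    view_dir = cam_pos - anchor_pos
    view_dir = view_dir / np.linalg.norm(view_dir)

    # Condition on the local (patch-level) expression code and the view direction.
    x = np.concatenate([anchor_feat, patch_expr, view_dir])
    h = np.maximum(weights["w1"] @ x + weights["b1"], 0.0)   # ReLU hidden layer
    out = (weights["w2"] @ h + weights["b2"]).reshape(k, 7)

    positions = anchor_pos + out[:, :3]   # Gaussian centers as offsets from the anchor
    colors = sigmoid(out[:, 3:6])         # RGB in (0, 1)
    opacities = sigmoid(out[:, 6])        # opacity in (0, 1)
    return positions, colors, opacities


rng = np.random.default_rng(0)
weights = init_weights(feat_dim=8, expr_dim=5, hidden=16, k=4, rng=rng)
positions, colors, opacities = synthesize_gaussians(
    anchor_pos=np.zeros(3),
    anchor_feat=rng.normal(size=8),
    patch_expr=rng.normal(size=5),        # expression code of the patch this anchor belongs to
    cam_pos=np.array([0.0, 0.0, 1.0]),
    weights=weights,
)
print(positions.shape, colors.shape, opacities.shape)
```

Because each anchor receives only the expression code of its own patch, a change in one facial region alters the Gaussians decoded there without affecting anchors in other patches, which is the locality the abstract contrasts with a global expression space.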

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.
