Generative A-Eye #4 - 19th Sept, 2024
A (more or less) daily newsletter featuring brief summaries of the latest papers related to AI-based human image synthesis, or to research related to this topic.
A short one today, with little of interest in regard to human synthesis - with one major exception (see below).
Only yesterday, in this newsletter, I was casting doubt on my recent enthusiasm for Gaussian Splatting as a viable replacement for autoencoder-based human cross-enactment (i.e., production-level deepfaking/de-aging, etc.). Almost immediately, a new paper heaves into view that refreshes this interest…
GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations
This new offering from Max Planck and Flawless AI presents a system that, as is typical of this year’s research, relies on the CGI-based interstitial framework FLAME to act as a bridge between input footage and the placement of Gaussian Splats on the target.
Sadly the code repository is currently empty - but hey, once in a blue moon, researchers actually post the code in the end!
(The clip below is a brief excerpt of the main video at the project page for GaussianHeads. If you are reading this in an email newsletter, clicking the link will take you to Substack, where you can play the video.)
'[A] new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real-time. At the core of our method is a hierarchical representation of head models that allows to capture the complex dynamics of facial expressions and head movements. First, with rich facial features extracted from raw input frames, we learn to deform the coarse facial geometry of the template mesh. We then initialize 3D Gaussians on the deformed surface and refine their positions in a fine step. We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework. This enables not only controllable facial animation via video inputs, but also high-fidelity novel view synthesis of challenging facial expressions, such as tongue deformations and fine-grained teeth structure under large motion changes'
http://export.arxiv.org/abs/2409.11951
https://vcai.mpi-inf.mpg.de/projects/GaussianHeads/
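For readers who like to see the shape of an idea in code, here is a toy sketch of the coarse-to-fine ordering that the abstract describes: deform a template mesh, seed Gaussian centres on the deformed surface, then nudge those centres in a fine step. This is not the authors' implementation (the repository is empty, after all). In the paper, the offsets are learned from image features; here they are fixed, made-up numbers. Placing one Gaussian at each triangle's barycentre is my own assumption, and all of the function names are hypothetical.

```python
# Illustrative sketch only. Fixed numbers stand in for quantities that the
# paper learns end-to-end; every name here is hypothetical.
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]


def add(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise sum of two 3D points/vectors."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def deform_template(vertices: List[Vec3], offsets: List[Vec3]) -> List[Vec3]:
    """Coarse step: shift each template vertex by its per-vertex offset."""
    return [add(v, o) for v, o in zip(vertices, offsets)]


def init_gaussian_centers(vertices: List[Vec3], faces: List[Face]) -> List[Vec3]:
    """Seed one Gaussian centre at the barycentre of each triangle (assumption)."""
    centers = []
    for i, j, k in faces:
        a, b, c = vertices[i], vertices[j], vertices[k]
        centers.append((
            (a[0] + b[0] + c[0]) / 3.0,
            (a[1] + b[1] + c[1]) / 3.0,
            (a[2] + b[2] + c[2]) / 3.0,
        ))
    return centers


def refine_centers(centers: List[Vec3], fine_offsets: List[Vec3]) -> List[Vec3]:
    """Fine step: apply a small per-Gaussian correction to each centre."""
    return [add(c, o) for c, o in zip(centers, fine_offsets)]


# A flat two-triangle "template" standing in for a head mesh.
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
faces = [(0, 1, 2), (0, 2, 3)]

# Made-up coarse deformation: lift one corner of the quad.
coarse_offsets = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.3), (0.0, 0.0, 0.0)]
# Made-up fine corrections, one per Gaussian.
fine_offsets = [(0.0, 0.0, 0.01), (0.0, 0.0, -0.01)]

deformed = deform_template(template, coarse_offsets)
coarse_centers = init_gaussian_centers(deformed, faces)
refined_centers = refine_centers(coarse_centers, fine_offsets)
```

The point of the ordering is that the Gaussians inherit the large-scale motion of the mesh for free, so the fine step only has to account for small residual corrections.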
Other papers of interest today:
Finding the Subjective Truth: Collecting 2 Million Votes for Comprehensive Gen-AI Model Evaluation
http://export.arxiv.org/abs/2409.11904
_________________________
My domain expertise is in AI image synthesis, and I’m the former science content head at Metaphysic.ai. I’m an AI developer, current machine learning practitioner, and an educator. I’m also a native Brit, currently resident in Bucharest, but possibly interested in relocation.
If you want to see more extensive examples of my writing on research, as well as some epic features, many of which hit big at Hacker News and garnered significant traffic, check out my portfolio website at https://martinanderson.ai.