Researchers from TU Munich introduced Face Anything, a transformer-based model for 4D facial reconstruction and dense tracking from arbitrary image sequences. The method maps pixels to a normalized canonical facial coordinate space, enabling temporally consistent geometry and reliable correspondences in a single feed-forward pass. It jointly predicts depth and canonical coordinates, achieving approximately 3× lower correspondence error and 16% better depth accuracy than prior dynamic reconstruction methods. The model, dataset, and code will be publicly available.
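
To illustrate why a shared canonical coordinate space yields correspondences, here is a minimal sketch, not the authors' implementation. It assumes each frame already has a per-pixel canonical coordinate prediction and matches pixels across frames by brute-force nearest neighbor in that space. The function name `match_by_canonical`, the dictionary layout, and the toy values are all hypothetical.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def match_by_canonical(canon_a, canon_b):
    """Match pixels of frame A to pixels of frame B.

    canon_a, canon_b: dict mapping a (row, col) pixel to its predicted
    (x, y, z) canonical facial coordinate. Hypothetical layout, chosen
    only for this illustration.
    Returns a dict mapping each pixel in A to the pixel in B whose
    canonical coordinate is closest.
    """
    matches = {}
    for pix_a, coord_a in canon_a.items():
        # Brute-force nearest neighbor in canonical space (an assumption;
        # the source does not describe how correspondences are extracted).
        pix_b = min(canon_b, key=lambda p: dist(coord_a, canon_b[p]))
        matches[pix_a] = pix_b
    return matches


# Toy data: the same two facial points seen at different pixels in two frames.
frame_a = {(10, 12): (0.20, 0.50, 0.10), (30, 40): (0.80, 0.30, 0.05)}
frame_b = {(11, 15): (0.21, 0.49, 0.10), (33, 44): (0.79, 0.31, 0.06)}

matches = match_by_canonical(frame_a, frame_b)
```

Because both frames are expressed in the same canonical space, matching reduces to a lookup rather than frame-to-frame motion estimation, which is one plausible reading of how the method obtains temporally consistent tracking.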

From 80.lv