Publications
Papers
International Journal of Computer Vision: "Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads" has been accepted to the International Journal of Computer Vision (IJCV, a top computer vision journal) Special Issue on Audio-Visual Generation!
We introduce ScanTalk, a speech-driven 3D talking-head system that can animate any 3D face mesh topology—including real 3D scans—without requiring fixed vertex correspondence.
Key contributions:
• Topology-agnostic animation: ScanTalk is the first framework that animates 3D faces regardless of mesh topology.
• Unregistered, fully unsupervised training: the Chamfer distance is widely used as a loss for unsupervised static 3D reconstruction; we propose a dynamic extension of this loss to learn speech-driven motion prediction in a completely unsupervised setting.
• Better evaluation: we highlight and analyze limitations of existing benchmark metrics for evaluating lip-sync and generation quality, and introduce new, more comprehensive metrics. Using these, we provide an extended evaluation of ScanTalk against state-of-the-art methods.
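For readers unfamiliar with the Chamfer distance mentioned above, here is a minimal sketch of the standard static, symmetric form between two point sets. It is not the dynamic extension proposed in the paper, whose exact form is not given on this page. The function names (`squared_distance`, `one_sided_chamfer`, `chamfer_distance`) and the choice of squared Euclidean distance averaged per point are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def one_sided_chamfer(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Mean squared distance from each source point to its nearest target point."""
    return sum(
        min(squared_distance(p, q) for q in target) for p in source
    ) / len(source)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (illustrative variant: sum of both directions).

    The two sets may have different sizes; no point-to-point
    correspondence between them is required.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return one_sided_chamfer(a, b) + one_sided_chamfer(b, a)


# Two toy "meshes" with different vertex counts (hypothetical data).
mesh_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mesh_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
print(chamfer_distance(mesh_a, mesh_b))
```

Because each point is matched to its nearest neighbour in the other set, the loss can compare shapes with different vertex counts, which is why it suits unregistered data. This brute-force version is quadratic in the number of points; practical implementations typically use batched tensor operations or spatial indexing.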
- European Conference on Computer Vision: Federico Nocentini, Thomas Besnier, Claudio Ferrari, Sylvain Arguillere, Stefano Berretti, Mohamed Daoudi - ScanTalk: 3D Talking Heads from Unregistered Scans. Proceedings of the European Conference on Computer Vision (ECCV), 2024. Project page - Code
- IEEE conference series on Automatic Face and Gesture Recognition: Filippo Principi, Stefano Berretti, Claudio Ferrari, Naima Otberdout, Mohamed Daoudi, Alberto Del Bimbo - The Florence 4D Facial Expression Dataset. FG 2023: 1-6
- IEEE Transactions on Affective Computing: Naima Otberdout, Claudio Ferrari, Mohamed Daoudi, Stefano Berretti, Alberto Del Bimbo: Generating Multiple 4D Expression Transitions by Learning Face Landmark Trajectories. IEEE Trans. Affect. Comput. 15(2): 566-578 (2024)
- Conference on Computer Vision and Pattern Recognition: Naima Otberdout, Claudio Ferrari, Mohamed Daoudi, Stefano Berretti, Alberto Del Bimbo: Sparse to Dense Dynamic 3D Facial Expression Generation. Conference on Computer Vision and Pattern Recognition (CVPR) 2022: 20353-20362