Cross-Modal Emotion Transfer for Emotion Editing in Talking Face Video • Paper • 2604.07786 • Published 12 days ago
HandVQA: Diagnosing and Improving Fine-Grained Spatial Reasoning about Hands in Vision-Language Models • Paper • 2603.26362 • Published 24 days ago
VPOcc: Exploiting Vanishing Point for 3D Semantic Occupancy Prediction • Paper • 2408.03551 • Published Aug 7, 2024