HeartMuLa: A Family of Open Sourced Music Foundation Models Paper • 2601.10547 • Published 8 days ago • 34
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands Paper • 2512.24965 • Published 23 days ago • 41
NitroGen: An Open Foundation Model for Generalist Gaming Agents Paper • 2601.02427 • Published 19 days ago • 42
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization Paper • 2601.01554 • Published 19 days ago • 54
InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields Paper • 2601.03252 • Published 17 days ago • 98
LTX-2: Efficient Joint Audio-Visual Foundation Model Paper • 2601.03233 • Published 17 days ago • 132
MOSS Transcribe Diarize: Transcribe audio/video files with speaker identification • Running • Featured • 47
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs Paper • 2502.11880 • Published Feb 17, 2025 • 4
NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos Paper • 2601.00393 • Published 22 days ago • 123
Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization Paper • 2512.24615 • Published 23 days ago • 116
WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling Paper • 2512.14614 • Published Dec 16, 2025 • 70