VINO: A Unified Visual Generator with Interleaved OmniModal Context Paper • 2601.02358 • Published 5 days ago • 28
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling Paper • 2509.12201 • Published Sep 15, 2025 • 106
π^3: Scalable Permutation-Equivariant Visual Geometry Learning Paper • 2507.13347 • Published Jul 17, 2025 • 65
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation Paper • 2410.08208 • Published Oct 10, 2024
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers Paper • 2507.01016 • Published Jul 1, 2025 • 1
OpenX-LeRobot Collection Open X-Embodiment datasets in LeRobot format with standard transformation (https://github.com/Tavish9/any4lerobot) • 34 items • Updated Aug 28, 2025 • 28