TimeBill: Time-Budgeted Inference for Large Language Models Paper • 2512.21859 • Published 6 days ago • 18
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 30 days ago • 242
StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published Oct 10, 2025 • 50
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published Sep 1, 2025 • 50
Interpretable and Reliable Detection of AI-Generated Images via Grounded Reasoning in MLLMs Paper • 2506.07045 • Published Jun 8, 2025 • 8
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published Jun 5, 2025 • 77
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2, 2025 • 147
Kimi-VL-A3B Collection Moonshot's efficient MoE VLMs, exceptional on agent, long-context, and thinking • 7 items • Updated Oct 30, 2025 • 77
Wan: Open and Advanced Large-Scale Video Generative Models Paper • 2503.20314 • Published Mar 26, 2025 • 56
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published Mar 18, 2025 • 50
GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing Paper • 2503.10639 • Published Mar 13, 2025 • 53
PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization Paper • 2503.01328 • Published Mar 3, 2025 • 16