CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Paper • 2603.24440 • Published 3 days ago • 86
M^4olGen: Multi-Agent, Multi-Stage Molecular Generation under Precise Multi-Property Constraints Paper • 2601.10131 • Published Jan 15 • 17
Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning Paper • 2601.09667 • Published Jan 14 • 92
Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs Paper • 2601.08763 • Published Jan 13 • 150
AutoEnv: Automated Environments for Measuring Cross-Environment Agent Learning Paper • 2511.19304 • Published Nov 24, 2025 • 92
Scaling Latent Reasoning via Looped Language Models Paper • 2510.25741 • Published Oct 29, 2025 • 229
ReCode: Unify Plan and Action for Universal Granularity Control Paper • 2510.23564 • Published Oct 27, 2025 • 123
Improving GUI Grounding with Explicit Position-to-Coordinate Mapping Paper • 2510.03230 • Published Oct 3, 2025 • 4
WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation Paper • 2508.16763 • Published Aug 22, 2025 • 2
Improving Context Fidelity via Native Retrieval-Augmented Reasoning Paper • 2509.13683 • Published Sep 17, 2025 • 8
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published Jun 16, 2025 • 93
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2, 2025 • 190
System-1.5 Reasoning: Traversal in Language and Latent Spaces with Dynamic Shortcuts Paper • 2505.18962 • Published May 25, 2025 • 12
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27, 2025 • 109
GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks Paper • 2504.12764 • Published Apr 17, 2025 • 42