Exploration and Exploitation Errors Are Measurable for Language Model Agents • Paper • 2604.13151 • Published 10 days ago • 24
Thinking Makes LLM Agents Introverted: How Mandatory Thinking Can Backfire in User-Engaged Agents • Paper • 2602.07796 • Published Feb 8 • 7
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense • Paper • 2510.07242 • Published Oct 8, 2025 • 30