Personalized Preference Fine-tuning of Diffusion Models • Paper 2501.06655 • Published Jan 11, 2025
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models • Paper 2502.17387 • Published Feb 24, 2025
RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems • Paper 2510.02263 • Published Oct 2, 2025