Nested Learning: The Illusion of Deep Learning Architectures Paper • 2512.24695 • Published 6 days ago • 27
Article Tensor Parallelism (TP) in Transformers: 5 Minutes to Understand Dec 4, 2025 • 63
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published Jan 29, 2025 • 59
Kandinsky 5.0 Video Lite Collection Kandinsky 5.0 Video Lite is a lightweight 2B model that generates up to 10-second SD videos from English and Russian prompts with high visual quality. • 9 items • Updated 23 days ago • 13
Kandinsky 5.0 Video Lite Diffusers Collection Kandinsky 5.0 Video Lite is a lightweight 2B model that generates up to 10-second SD videos from English and Russian prompts with high visual quality. • 8 items • Updated 23 days ago • 5
Kandinsky 5.0 Video Pro Diffusers Collection Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 4 items • Updated 23 days ago • 10
Kandinsky 5.0 Video Pro Collection Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 5 items • Updated 23 days ago • 15
Kandinsky 5.0 Image Lite Collection Kandinsky 5.0 Image Lite is a 6B DiT-based model that generates and edits HD images from English and Russian text prompts with high visual quality. • 4 items • Updated 23 days ago • 16
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism Paper • 2511.11373 • Published Nov 14, 2025 • 12
INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats Paper • 2510.25602 • Published Oct 29, 2025 • 77
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 49