Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 4 days ago • 118
ERNIE-Image Collection The series of image generation models, including text2img and img2img. • 4 items • Updated 12 days ago • 24
PiD: Fast and High-Resolution Latent Decoding with Pixel Diffusion Paper • 2605.23902 • Published 10 days ago • 44