VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study
This is the RL model, fine-tuned from the warmed-up OpenVLA model.
RL training took about 1.5M environment steps.
For more details, please refer to the codebase and the paper.