Hello everyone,
Regarding post-training Isaac GR00T N1.5: is it possible to train it on a custom robot with a custom real-world dataset?
Thanks in advance
Seems possible?
You can post-train Isaac GR00T N1.5 on a custom robot using your own real-world dataset. NVIDIA’s public model card states N1.5 is adaptable via post-training; the Hugging Face tutorial shows a complete run on a new embodiment; LeRobot’s docs explain the dataset format, processors, and the GR00T policy integration. (Hugging Face)
The key points:

- Set `embodiment_tag="new_embodiment"` so GR00T learns your robot's interface. This is the documented path when your hardware wasn't in pretraining. (GitHub)
- Use the LeRobot v3 dataset format (`data/` Parquet, `videos/` MP4, `meta/` JSON). You can stream from the Hub or load locally. (Hugging Face)
- The base model is `nvidia/GR00T-N1.5-3B`. Confirm the license note ("ready for non-commercial use"). (Hugging Face)
- LeRobot v3 includes `meta/` files describing schema, FPS, and episode offsets, and supports `StreamingLeRobotDataset` so you can train without downloading. (Hugging Face)
- Add `meta/modality.json` to your dataset: copy an example, then edit camera names, state keys, and action dims (a sketch follows the fine-tune command below). In the official tutorial this is Step 1.2. For new robots, set `embodiment_tag` to `new_embodiment`. (Hugging Face)
- Fine-tune with the official script (`scripts/gr00t_finetune.py`). The tutorial notes ~25 GB VRAM with default settings and shows flags to reduce memory if needed. (Hugging Face)

```bash
# Fine-tune GR00T N1.5 on your LeRobot v3 dataset
# refs:
#   blog: https://huggingface.co/blog/nvidia/gr00t-n1-5-so101-tuning
#   repo: https://github.com/NVIDIA/Isaac-GR00T
python scripts/gr00t_finetune.py \
  --dataset-path /data/my_robot_v3_dataset \
  --num-gpus 1 \
  --output-dir ./checkpoints/my_robot_n1p5 \
  --max-steps 10000 \
  --data-config so100_dualcam \
  --video-backend torchvision_av
```
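For reference, here is what a minimal `meta/modality.json` might look like, written from Python to keep everything in one language. This is a hedged sketch modeled on the single-arm examples in the Isaac-GR00T repo; the slice indices, camera names, and dataset path are assumptions you must adapt to your robot.

```python
# Hedged sketch of meta/modality.json for a single-arm robot.
# Slice indices, camera names, and the dataset path are assumptions.
import json
from pathlib import Path

modality = {
    "state": {
        "single_arm": {"start": 0, "end": 5},  # e.g. 5 joint positions
        "gripper": {"start": 5, "end": 6},     # e.g. 1 gripper value
    },
    "action": {
        "single_arm": {"start": 0, "end": 5},
        "gripper": {"start": 5, "end": 6},
    },
    "video": {
        "front": {"original_key": "observation.images.front"},
        "wrist": {"original_key": "observation.images.wrist"},
    },
    "annotation": {
        "human.task_description": {"original_key": "task_index"},
    },
}

out = Path("/data/my_robot_v3_dataset/meta/modality.json")
out.parent.mkdir(parents=True, exist_ok=True)
out.write_text(json.dumps(modality, indent=2))
```

The `start`/`end` pairs slice your flat state and action vectors into named groups, which is why they must match the dims actually recorded in your Parquet shards.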
On the v3 format and loaders:

- Layout: `meta/info.json` (schema, fps), `meta/stats.json` (norm stats), `meta/episodes/` (episode offsets), `data/` Parquet shards, `videos/` per-camera MP4 shards. Episode views are reconstructed from metadata. (Hugging Face)
- Loaders: `LeRobotDataset(...)` for a cached local copy; `StreamingLeRobotDataset(...)` for on-the-fly training. Both return dicts with keys like `observation.images.front`, `observation.state`, `action`. (Hugging Face)

```python
# pip install "lerobot>=0.4.0"   # docs: https://huggingface.co/docs/lerobot
# refs:
#   dataset:  https://huggingface.co/datasets/lerobot/svla_so101_pickplace
#   tutorial: https://huggingface.co/blog/nvidia/gr00t-n1-5-so101-tuning
from lerobot.datasets.streaming_dataset import StreamingLeRobotDataset

repo_id = "lerobot/svla_so101_pickplace"  # small, clean, GR00T-ready example
ds = StreamingLeRobotDataset(repo_id)     # streams from the Hub
sample = next(iter(ds))                   # streaming datasets are iterable, not indexable
# sample is a dict with observation.*, action
```
Run the tutorial’s gr00t_finetune.py with your dataset path after you validate keys and shapes. (Hugging Face)
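A quick way to do that validation, as a hedged sketch (the path, `repo_id`, and key names are placeholders for your own dataset):

```python
# Hedged validation sketch: read fps from meta/info.json and print one
# sample's keys and tensor shapes to compare against meta/modality.json.
import json
from pathlib import Path

import torch
from lerobot.datasets.lerobot_dataset import LeRobotDataset

root = Path("/data/my_robot_v3_dataset")                 # hypothetical local dataset
info = json.loads((root / "meta" / "info.json").read_text())
print("fps:", info["fps"])                               # capture frame rate

# repo_id is a placeholder; a local map-style dataset supports indexing
ds = LeRobotDataset(repo_id="my_org/my_robot_v3_dataset", root=root)
sample = ds[0]
for key, value in sample.items():
    if isinstance(value, torch.Tensor):
        print(key, tuple(value.shape))                   # e.g. observation.state (6,)
```

If a key is missing or a shape disagrees with your `modality.json` slices, fix the dataset before spending GPU hours on the fine-tune.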
Models
- `nvidia/GR00T-N1.5-3B`: base policy for post-training; the model card explicitly mentions post-training support and shows I/O. (Hugging Face)
- Related checkpoints can be found with the `gr00t_n1_5` filter on the Hub. (Hugging Face)

Datasets

- `lerobot/svla_so101_pickplace` and `lerobot/svla_so100_pickplace`: small, proven with the official tutorial and LeRobot loaders. Good for smoke tests. (Hugging Face)
- `lerobot/droid_1.0.1`: large in-the-wild manipulation demos in LeRobot format. Useful for diversity or pretraining. (Hugging Face)

Docs you will use repeatedly
If you are data-limited, NVIDIA’s GR00T-Dreams blueprint shows a pipeline to generate large synthetic trajectory sets and mix them with real demos for post-training. (NVIDIA Developer)
- Checklist: make sure `meta/modality.json` matches your sensors, action dims, and camera names, and that `embodiment_tag="new_embodiment"`. (Hugging Face)

Official + model cards

- The `nvidia/GR00T-N1.5-3B` model card on the Hub and the Isaac-GR00T GitHub repo. (Hugging Face)

Step-by-step guides

- The official SO-101 fine-tuning tutorial, which covers `modality.json`, training, eval, deploy, and VRAM. (Hugging Face)

Datasets

- `lerobot/svla_so101_pickplace`, `lerobot/svla_so100_pickplace`, `lerobot/droid_1.0.1`: ready to load. (Hugging Face)

Troubleshooting and pitfalls

- If loading fails, check your `modality.json` keys and verify the v3 loader version. (GitHub)
- Training can destabilize when `action_dim` is large; restarts or stabilization recipes help. Monitor loss/grad norms (a generic sketch follows this list). (GitHub)
- If your camera names differ from the example (e.g., your `tip` vs. the tutorial's `wrist`), update `modality.json` and the processors accordingly. Community guides show concrete edits. (Zenn)
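For the grad-norm monitoring, here is a generic PyTorch helper; it is not part of the Isaac-GR00T API, just a plain sketch you could wire into a callback or custom loop (`model`, `loss`, and `optimizer` are placeholders):

```python
# Generic PyTorch helper (not Isaac-GR00T API): compute the global L2 norm of
# all parameter gradients after loss.backward() to spot instability spikes.
import torch

def global_grad_norm(model: torch.nn.Module) -> float:
    norms = [p.grad.detach().norm(2) for p in model.parameters() if p.grad is not None]
    return torch.norm(torch.stack(norms), 2).item() if norms else 0.0

# Usage inside a training step (placeholders):
#   loss.backward()
#   print(f"grad_norm={global_grad_norm(model):.2f}")
#   optimizer.step()
```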
Thank you very much for the detailed explanation!