---
language: en
license: mit
---

# M-1123_newmodels__olmo7b_sft_r1_ct3arg-rl

## Model Details

- **Training Method**: VeRL Reinforcement Learning (RL)
- **Stage Name**: rl
- **Experiment**: 1123_newmodels__olmo7b_sft_r1_ct3arg
- **RL Framework**: VeRL (Versatile Reinforcement Learning)

## Training Configuration

## Experiment Tracking

🔗 **View complete experiment details**: https://huggingface.co/datasets/SkillFactory/D-ExpTracker__1123_newmodels__olmo7b_sft_r1_ct3arg__v1

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SkillFactory/M-1123_newmodels__olmo7b_sft_r1_ct3arg-rl")
model = AutoModelForCausalLM.from_pretrained("SkillFactory/M-1123_newmodels__olmo7b_sft_r1_ct3arg-rl")
```
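
The snippet above only loads the weights. As a minimal sketch of how text generation might look, the helper below wraps the standard `transformers` calls (`tokenizer(...)`, `model.generate(...)`, `tokenizer.decode(...)`). The names `MODEL_ID`, `load_model_and_tokenizer`, and `generate_reply`, the plain-text prompt format, and the `max_new_tokens=128` default are illustrative assumptions, not settings documented for this model.

```python
# Hypothetical helper names; only the model ID comes from this model card.
MODEL_ID = "SkillFactory/M-1123_newmodels__olmo7b_sft_r1_ct3arg-rl"


def load_model_and_tokenizer(model_id=MODEL_ID):
    """Load the tokenizer and model from the Hugging Face Hub."""
    # Imported lazily so generate_reply can be used without importing
    # transformers at module load time.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def generate_reply(model, tokenizer, prompt, max_new_tokens=128):
    """Generate a continuation for `prompt` and return only the new text.

    Assumes a plain-text prompt; whether this model expects a chat
    template is not stated in the card.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = len(inputs["input_ids"][0])

    # For causal LMs, generate() returns the prompt tokens followed by
    # the newly generated tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][prompt_len:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

To try it, call `tokenizer, model = load_model_and_tokenizer()` and then `print(generate_reply(model, tokenizer, "What is 12 * 7?"))`. Loading a 7B-parameter model this way runs on CPU in full precision by default, so you may want to pass device or dtype options to `from_pretrained` depending on your hardware.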