PSFT+RL models
SII-Wenhong
wh-zhu
Recent Activity
upvoted a paper 1 day ago
Hybrid Policy Distillation for LLMs
submitted a paper 1 day ago
Hybrid Policy Distillation for LLMs
new activity 2 days ago
wh-zhu/Qwen2.5-7B-PSFT-RL-DAPO-90: Add model card and metadata