PingchengDong
heisei
AI & ML interests
None yet
Recent Activity
upvoted a paper about 2 months ago
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
liked a model 3 months ago
nvidia/DLER-R1-7B-Research
liked a model 3 months ago
nvidia/DLER-Llama-Nemotron-8B-Merge-Research
Organizations
None yet