agurung/lcb-ft-v2-qwen3-4b-rft-mixed-24-lora-r128-a32-lr2p5e-4-const-lr2p5e-4-qps8-gpuauto-ep4 Updated 27 days ago • 47
agurung/lcb-ft-v2-qwen3-4b-rft-iid-24-lora-r128-a32-lr2p5e-4-const-lr2p5e-4-qps8-gpuauto-ep4 Updated 27 days ago • 23
agurung/lcb-ft-v2-qwen3-4b-dft-mixed-24-lora-r128-a32-lr2p5e-4-const-lr2p5e-4-qps8-gpuauto-ep4 Updated 27 days ago • 20
agurung/lcb-ft-v2-qwen3-4b-sft-mixed-24-lora-r128-a32-lr2p5e-4-const-lr2p5e-4-qps8-gpuauto-ep4 Updated 27 days ago • 27
agurung/lcb-ft-v2-qwen3-4b-dft-iid-24-lora-r128-a32-lr2p5e-4-const-lr2p5e-4-qps8-gpuauto-ep4 Updated 27 days ago • 25
agurung/lcb-ft-v2-qwen3-4b-sft-iid-24-lora-r128-a32-lr2p5e-4-const-lr2p5e-4-qps8-gpuauto-ep4 Updated 27 days ago • 18
agurung/flawed-fictions-qwen3-4b-lengthpenalty-litereason Reinforcement Learning • 4B • Updated Mar 10 • 2