Reasoning_Models
OpenMMReasoner/OpenMMReasoner-RL Image-Text-to-Text • 8B • Updated Dec 30, 2025 • 20 • 17
Training Tricks
ZClip: Adaptive Spike Mitigation for LLM Pre-Training Paper • 2504.02507 • Published Apr 3, 2025 • 89
Variance Control via Weight Rescaling in LLM Pre-training Paper • 2503.17500 • Published Mar 21, 2025 • 5
Gpu Tflop Finder 😻 Get the TFLOPs for your GPU quickly • Running • Agents • 12
Image_Classifiers
timm/eva_large_patch14_196.in22k_ft_in22k_in1k Image Classification • Updated Jan 21, 2025 • 7.62k • 3
amaye15/microsoft-swinv2-base-patch4-window16-256-batch32-lr0.005-standford-dogs Image Classification • 87M • Updated May 16, 2024 • 3