Submitted by akhaliq 124 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training · 9 authors 315 7
Submitted by akhaliq 36 Optimizing Large Language Model Training Using FP4 Quantization · 8 authors 4
Submitted by akhaliq 32 Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling · 7 authors 4
Submitted by paulpanwang 22 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation · 5 authors 476 3
Submitted by akhaliq 12 Low-Rank Adapters Meet Neural Architecture Search for LLM Compression · 3 authors 73 2
Submitted by akhaliq 7 TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models · 5 authors 120 5
Submitted by amanchadha 7 IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding · 7 authors 2
Submitted by iproskurina 4 Histoires Morales: A French Dataset for Assessing Moral Alignment · Laboratoire Hubert Curien 3 2