DS' Daily paper
Instruction Pre-Training: Language Models are Supervised Multitask
Learners
Paper
• 2406.14491
• Published
• 96
Transformers are SSMs: Generalized Models and Efficient Algorithms
Through Structured State Space Duality
Paper
• 2405.21060
• Published
• 68
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small
Reference Models
Paper
• 2405.20541
• Published
• 24
MMLU-Pro: A More Robust and Challenging Multi-Task Language
Understanding Benchmark
Paper
• 2406.01574
• Published
• 51
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Paper
• 2406.00888
• Published
• 33
Artificial Generational Intelligence: Cultural Accumulation in
Reinforcement Learning
Paper
• 2406.00392
• Published
• 14
Mobile-Agent-v2: Mobile Device Operation Assistant with Effective
Navigation via Multi-Agent Collaboration
Paper
• 2406.01014
• Published
• 33
Block Transformer: Global-to-Local Language Modeling for Fast Inference
Paper
• 2406.02657
• Published
• 41
Parrot: Multilingual Visual Instruction Tuning
Paper
• 2406.02539
• Published
• 36
Mixture-of-Agents Enhances Large Language Model Capabilities
Paper
• 2406.04692
• Published
• 59
Large Language Model Confidence Estimation via Black-Box Access
Paper
• 2406.04370
• Published
• 22
CRAG -- Comprehensive RAG Benchmark
Paper
• 2406.04744
• Published
• 46
PowerInfer-2: Fast Large Language Model Inference on a Smartphone
Paper
• 2406.06282
• Published
• 39
Paper
• 2406.04127
• Published
• 39
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context
Language Modeling
Paper
• 2406.07522
• Published
• 40
Transformers meet Neural Algorithmic Reasoners
Paper
• 2406.09308
• Published
• 44
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code
Intelligence
Paper
• 2406.11931
• Published
• 69
Bootstrapping Language Models with DPO Implicit Rewards
Paper
• 2406.09760
• Published
• 41
TroL: Traversal of Layers for Large Language and Vision Models
Paper
• 2406.12246
• Published
• 36
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Paper
• 2406.12275
• Published
• 31
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs
Paper
• 2406.15319
• Published
• 64
Judging the Judges: Evaluating Alignment and Vulnerabilities in
LLMs-as-Judges
Paper
• 2406.12624
• Published
• 37
The FineWeb Datasets: Decanting the Web for the Finest Text Data at
Scale
Paper
• 2406.17557
• Published
• 100
Direct Preference Optimization: Your Language Model is Secretly a Reward
Model
Paper
• 2305.18290
• Published
• 64
Scaling Relationship on Learning Mathematical Reasoning with Large
Language Models
Paper
• 2308.01825
• Published
• 23
SLiC-HF: Sequence Likelihood Calibration with Human Feedback
Paper
• 2305.10425
• Published
• 7
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning
Paper
• 2410.01044
• Published
• 35
CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation
Generation
Paper
• 2410.23090
• Published
• 55
LLaMo: Large Language Model-based Molecular Graph Assistant
Paper
• 2411.00871
• Published
• 22
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Framework
Paper
• 2308.08155
• Published
• 11
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper
• 2303.11366
• Published
• 5
Scaling RL to Long Videos
Paper
• 2507.07966
• Published
• 160
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable
Reinforcement Learning
Paper
• 2507.01006
• Published
• 251
∇NABLA: Neighborhood Adaptive Block-Level Attention
Paper
• 2507.13546
• Published
• 125