zombieofCrypto's Collections: llm_improvement_research
updated
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper
• 2501.12948
• Published
• 441
LightThinker: Thinking Step-by-Step Compression
Paper
• 2502.15589
• Published
• 31
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Paper
• 2405.04434
• Published
• 25
Model Compression and Efficient Inference for Large Language Models: A Survey
Paper
• 2402.09748
• Published
• 2
Efficient Transformers: A Survey
Paper
• 2009.06732
• Published
• 1
Keyformer: KV Cache Reduction through Key Tokens Selection for Efficient Generative Inference
Paper
• 2403.09054
• Published
• 1
FastCache: Optimizing Multimodal LLM Serving through Lightweight KV-Cache Compression Framework
Paper
• 2503.08461
• Published
Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning
Paper
• 2503.04973
• Published
• 26