admarcosai's Collections: Pending Papers
Video Creation by Demonstration
Paper
• 2412.09551
• Published
• 9
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for
Customized Manga Generation
Paper
• 2412.07589
• Published
• 48
Unraveling the Complexity of Memory in RL Agents: an Approach for
Classification and Evaluation
Paper
• 2412.06531
• Published
• 72
APOLLO: SGD-like Memory, AdamW-level Performance
Paper
• 2412.05270
• Published
• 37
Ultra-Sparse Memory Network
Paper
• 2411.12364
• Published
• 23
Paper
• 2409.07429
• Published
• 32
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in
Long-Horizon Tasks
Paper
• 2408.03615
• Published
• 31
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge
Bases
Paper
• 2407.12784
• Published
• 51
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for
LLM Agents
Paper
• 2407.04363
• Published
• 34
VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding
Paper
• 2403.11481
• Published
• 13
Evaluating Very Long-Term Conversational Memory of LLM Agents
Paper
• 2402.17753
• Published
• 19
ChatQA: Building GPT-4 Level Conversational QA Models
Paper
• 2401.10225
• Published
• 36
Commonsense-augmented Memory Construction and Management in Long-term
Conversations via Context-aware Persona Refinement
Paper
• 2401.14215
• Published
• 2
Effective and Efficient Conversation Retrieval for Dialogue State
Tracking with Implicit Text Summaries
Paper
• 2402.13043
• Published
• 2
Evaluating and Aligning CodeLLMs on Human Preference
Paper
• 2412.05210
• Published
• 50
UniReal: Universal Image Generation and Editing via Learning Real-world
Dynamics
Paper
• 2412.07774
• Published
• 30
Paper
• 2412.07724
• Published
• 18
Fully Open Source Moxin-7B Technical Report
Paper
• 2412.06845
• Published
• 11
Training Large Language Models to Reason in a Continuous Latent Space
Paper
• 2412.06769
• Published
• 94
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper
• 2412.06559
• Published
• 86
Maya: An Instruction Finetuned Multilingual Multimodal Model
Paper
• 2412.07112
• Published
• 28
Expanding Performance Boundaries of Open-Source Multimodal Models with
Model, Data, and Test-Time Scaling
Paper
• 2412.05271
• Published
• 160
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
Paper
• 2412.04862
• Published
• 50
Evaluating Language Models as Synthetic Data Generators
Paper
• 2412.03679
• Published
• 47
Code-as-Monitor: Constraint-aware Visual Programming for Reactive and
Proactive Robotic Failure Detection
Paper
• 2412.04455
• Published
• 38
CodeChain: Towards Modular Code Generation Through Chain of
Self-revisions with Representative Sub-modules
Paper
• 2310.08992
• Published
• 12
Paper
• 2412.04315
• Published
• 19
Discriminative Fine-tuning of LVLMs
Paper
• 2412.04378
• Published
• 10
MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
Paper
• 2412.04448
• Published
• 10
PaliGemma 2: A Family of Versatile VLMs for Transfer
Paper
• 2412.03555
• Published
• 133
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic
Data From Large Language Models
Paper
• 2412.02980
• Published
• 15
Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training
in LLMs
Paper
• 2411.08719
• Published
• 1
Little Giants: Synthesizing High-Quality Embedding Data at Scale
Paper
• 2410.18634
• Published
A Survey on Data Synthesis and Augmentation for Large Language Models
Paper
• 2410.12896
• Published
• 1
Self-Improvement in Language Models: The Sharpening Mechanism
Paper
• 2412.01951
• Published
Large Language Models Can Self-Improve in Long-context Reasoning
Paper
• 2411.08147
• Published
• 65
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's
Reasoning Capability
Paper
• 2411.19943
• Published
• 62
MALT: Improving Reasoning with Multi-Agent LLM Training
Paper
• 2412.01928
• Published
• 45
Multi-Agent Large Language Models for Conversational Task-Solving
Paper
• 2410.22932
• Published
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and
Pruning
Paper
• 2412.03248
• Published
• 26
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on
Retrieval-Augmented Generation
Paper
• 2412.02592
• Published
• 24
Scaling Image Tokenizers with Grouped Spherical Quantization
Paper
• 2412.02632
• Published
• 10
X-Prompt: Towards Universal In-Context Image Generation in
Auto-Regressive Vision Language Foundation Models
Paper
• 2412.01824
• Published
• 64
o1-Coder: an o1 Replication for Coding
Paper
• 2412.00154
• Published
• 44
Open-Sora Plan: Open-Source Large Video Generation Model
Paper
• 2412.00131
• Published
• 33
The Well: a Large-Scale Collection of Diverse Physics Simulations for
Machine Learning
Paper
• 2412.00568
• Published
• 24
PhysGame: Uncovering Physical Commonsense Violations in Gameplay Videos
Paper
• 2412.01800
• Published
• 6
A Simple and Provable Scaling Law for the Test-Time Compute of Large
Language Models
Paper
• 2411.19477
• Published
• 6
Exploring the Abilities of Large Language Models to Solve Proportional
Analogies via Knowledge-Enhanced Prompting
Paper
• 2412.00869
• Published
• 4
World-consistent Video Diffusion with Explicit 3D Modeling
Paper
• 2412.01821
• Published
• 4
Yi-Lightning Technical Report
Paper
• 2412.01253
• Published
• 28
Reverse Thinking Makes LLMs Stronger Reasoners
Paper
• 2411.19865
• Published
• 23
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering
Benchmark Dataset
Paper
• 2411.15640
• Published
• 5
Large Language Model-Brained GUI Agents: A Survey
Paper
• 2411.18279
• Published
• 30
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for
Quantized LLMs with 100T Training Tokens
Paper
• 2411.17691
• Published
• 13
Learning 3D Representations from Procedural 3D Programs
Paper
• 2411.17467
• Published
• 9
Star Attention: Efficient LLM Inference over Long Sequences
Paper
• 2411.17116
• Published
• 53
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple
Distillation, Big Progress or Bitter Lesson?
Paper
• 2411.16489
• Published
• 45
From Generation to Judgment: Opportunities and Challenges of
LLM-as-a-judge
Paper
• 2411.16594
• Published
• 39
Reflections from the 2024 Large Language Model (LLM) Hackathon for
Applications in Materials Science and Chemistry
Paper
• 2411.15221
• Published
• 30
All Languages Matter: Evaluating LMMs on Culturally Diverse 100
Languages
Paper
• 2411.16508
• Published
• 10
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
Paper
• 2411.15671
• Published
• 8
LLMs Do Not Think Step-by-step In Implicit Reasoning
Paper
• 2411.15862
• Published
• 9
Predicting Emergent Capabilities by Finetuning
Paper
• 2411.16035
• Published
• 7
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
Paper
• 2411.15124
• Published
• 67
A Flexible Large Language Models Guardrail Development Methodology
Applied to Off-Topic Prompt Detection
Paper
• 2411.12946
• Published
• 22
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Paper
• 2411.13543
• Published
• 19
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Paper
• 2411.14405
• Published
• 61
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented
LMs
Paper
• 2411.14199
• Published
• 34
Hymba: A Hybrid-head Architecture for Small Language Models
Paper
• 2411.13676
• Published
• 47
Do I Know This Entity? Knowledge Awareness and Hallucinations in
Language Models
Paper
• 2411.14257
• Published
• 14
Patience Is The Key to Large Language Model Reasoning
Paper
• 2411.13082
• Published
• 7
VBench++: Comprehensive and Versatile Benchmark Suite for Video
Generative Models
Paper
• 2411.13503
• Published
• 34
SageAttention2 Technical Report: Accurate 4 Bit Attention for
Plug-and-play Inference Acceleration
Paper
• 2411.10958
• Published
• 57
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context
Training
Paper
• 2411.13476
• Published
• 16
Continuous Speculative Decoding for Autoregressive Image Generation
Paper
• 2411.11925
• Published
• 16
Building Trust: Foundations of Security, Safety and Transparency in AI
Paper
• 2411.12275
• Published
• 11
Evaluating Tokenizer Performance of Large Language Models Across
Official Indian Languages
Paper
• 2411.12240
• Published
• 7
Generative World Explorer
Paper
• 2411.11844
• Published
• 77
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large
Language Models on Mobile Devices
Paper
• 2411.10640
• Published
• 46
Search, Verify and Feedback: Towards Next Generation Post-training
Paradigm of Foundation Models via Verifier Engineering
Paper
• 2411.11504
• Published
• 24
Drowning in Documents: Consequences of Scaling Reranker Inference
Paper
• 2411.11767
• Published
• 19
Comprehensive and Practical Evaluation of Retrieval-Augmented Generation
Systems for Medical Question Answering
Paper
• 2411.09213
• Published
• 7
Evaluating the role of 'Constitutions' for learning from AI feedback
Paper
• 2411.10168
• Published
• 5
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer
Use
Paper
• 2411.10323
• Published
• 34
LLaVA-o1: Let Vision Language Models Reason Step-by-Step
Paper
• 2411.10440
• Published
• 129
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models
Paper
• 2411.09595
• Published
• 77
Hardware and Software Platform Inference
Paper
• 2411.05197
• Published
• 4
Stronger Models are NOT Stronger Teachers for Instruction Tuning
Paper
• 2411.07133
• Published
• 38
Scaling Properties of Diffusion Models for Perceptual Tasks
Paper
• 2411.08034
• Published
• 13
GRS-QA -- Graph Reasoning-Structured Question Answering Dataset
Paper
• 2411.00369
• Published
• 7
GPT or BERT: why not both?
Paper
• 2410.24159
• Published
• 14
Graph-constrained Reasoning: Faithful Reasoning on Knowledge Graphs with
Large Language Models
Paper
• 2410.13080
• Published
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A
Gradient Perspective
Paper
• 2410.23743
• Published
• 64
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM
Inference
Paper
• 2410.21465
• Published
• 11
RARe: Retrieval Augmented Retrieval with In-Context Examples
Paper
• 2410.20088
• Published
• 4
Autoregressive Models in Vision: A Survey
Paper
• 2411.05902
• Published
• 19
Game-theoretic LLM: Agent Workflow for Negotiation Games
Paper
• 2411.05990
• Published
• 9
Language Models are Hidden Reasoners: Unlocking Latent Reasoning
Capabilities via Self-Rewarding
Paper
• 2411.04282
• Published
• 37
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Paper
• 2411.04905
• Published
• 127
BitNet a4.8: 4-bit Activations for 1-bit LLMs
Paper
• 2411.04965
• Published
• 69
Mixture-of-Transformers: A Sparse and Scalable Architecture for
Multi-Modal Foundation Models
Paper
• 2411.04996
• Published
• 50
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large
Language Model
Paper
• 2411.04496
• Published
• 22
Self-Consistency Preference Optimization
Paper
• 2411.04109
• Published
• 19
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle
Grandmaster Level
Paper
• 2411.03562
• Published
• 69
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM
Quantization
Paper
• 2411.02355
• Published
• 51
How Far is Video Generation from World Model: A Physical Law Perspective
Paper
• 2411.02385
• Published
• 34
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated
Parameters by Tencent
Paper
• 2411.02265
• Published
• 25
Adaptive Caching for Faster Video Generation with Diffusion Transformers
Paper
• 2411.02397
• Published
• 23
Constrained Diffusion Implicit Models
Paper
• 2411.00359
• Published
• 6
Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and
Cross-Cultural Embedding Models and Benchmarks
Paper
• 2411.01192
• Published
• 5
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper
• 2410.23218
• Published
• 49
Personalization of Large Language Models: A Survey
Paper
• 2411.00027
• Published
• 33
Survey of User Interface Design and Interaction Techniques in Generative
AI Applications
Paper
• 2410.22370
• Published
• 12
BitStack: Fine-Grained Size Control for Compressed Large Language Models
in Variable Memory Environments
Paper
• 2410.23918
• Published
• 21
SelfCodeAlign: Self-Alignment for Code Generation
Paper
• 2410.24198
• Published
• 24
AAAR-1.0: Assessing AI's Potential to Assist Research
Paper
• 2410.22394
• Published
• 16
On Memorization of Large Language Models in Logical Reasoning
Paper
• 2410.23123
• Published
• 18
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science
Competitions
Paper
• 2410.20424
• Published
• 40
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World
Exploration, Feedback and Optimization
Paper
• 2410.19609
• Published
• 18
A Survey of Small Language Models
Paper
• 2410.20011
• Published
• 46
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized
Generalist Computer Assistant
Paper
• 2410.18603
• Published
• 32
Counting Ability of Large Language Models and Impact of Tokenization
Paper
• 2410.19730
• Published
• 11
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for
Contrastive Loss
Paper
• 2410.17243
• Published
• 92
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis
from Scratch
Paper
• 2410.18693
• Published
• 42
Unbounded: A Generative Infinite Game of Character Life Simulation
Paper
• 2410.18975
• Published
• 37
Multi-Draft Speculative Sampling: Canonical Architectures and
Theoretical Limits
Paper
• 2410.18234
• Published
• 4
WorldSimBench: Towards Video Generation Models as World Simulators
Paper
• 2410.18072
• Published
• 19
LCM-LoRA: A Universal Stable-Diffusion Acceleration Module
Paper
• 2311.05556
• Published
• 87
Latent Consistency Models: Synthesizing High-Resolution Images with
Few-Step Inference
Paper
• 2310.04378
• Published
• 22
Conditional Diffusion Distillation
Paper
• 2310.01407
• Published
• 20
Aligning Text-to-Image Diffusion Models with Reward Backpropagation
Paper
• 2310.03739
• Published
• 22
Large Concept Models: Language Modeling in a Sentence Representation
Space
Paper
• 2412.08821
• Published
• 17
The Role of Summarization in Generative Agents: A Preliminary
Perspective
Paper
• 2305.01253
• Published
• 1
Generative Agents: Interactive Simulacra of Human Behavior
Paper
• 2304.03442
• Published
• 15
SOTOPIA-π: Interactive Learning of Socially Intelligent Language
Agents
Paper
• 2403.08715
• Published
• 21
Generative Agent Simulations of 1,000 People
Paper
• 2411.10109
• Published
• 5
Self-Rewarding Language Models
Paper
• 2401.10020
• Published
• 152
DRLC: Reinforcement Learning with Dense Rewards from LLM Critic
Paper
• 2401.07382
• Published
• 2
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper
• 2401.06080
• Published
• 28
Is this the real life? Is this just fantasy? The Misleading Success of
Simulating Social Interactions With LLMs
Paper
• 2403.05020
• Published
• 2
Improving Reinforcement Learning from Human Feedback Using Contrastive
Rewards
Paper
• 2403.07708
• Published
Large Language Model-based Human-Agent Collaboration for Complex Task
Solving
Paper
• 2402.12914
• Published
Interactive Agents: Simulating Counselor-Client Psychological Counseling
via Role-Playing LLM-to-LLM Interactions
Paper
• 2408.15787
• Published
• 2
Building Cooperative Embodied Agents Modularly with Large Language
Models
Paper
• 2307.02485
• Published
• 12
Natural Language Reinforcement Learning
Paper
• 2411.14251
• Published
• 31
Challenges in Human-Agent Communication
Paper
• 2412.10380
• Published
• 2
From Individual to Society: A Survey on Social Simulation Driven by
Large Language Model-based Agents
Paper
• 2412.03563
• Published
• 1
AgentSense: Benchmarking Social Intelligence of Language Agents through
Interactive Scenarios
Paper
• 2410.19346
• Published
ReSpAct: Harmonizing Reasoning, Speaking, and Acting Towards Building
Large Language Model-Based Conversational AI Agents
Paper
• 2411.00927
• Published
• 2
Simulating User Agents for Embodied Conversational-AI
Paper
• 2410.23535
• Published
• 1
Positive Experience Reflection for Agents in Interactive Text
Environments
Paper
• 2411.02223
• Published
From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons
Paper
• 2412.08442
• Published
OpenDevin: An Open Platform for AI Software Developers as Generalist
Agents
Paper
• 2407.16741
• Published
• 76
Agentless: Demystifying LLM-based Software Engineering Agents
Paper
• 2407.01489
• Published
• 65
Scaling Instructable Agents Across Many Simulated Worlds
Paper
• 2404.10179
• Published
• 28
CodeNav: Beyond tool-use to using real-world codebases with LLM agents
Paper
• 2406.12276
• Published
Code Agents are State of the Art Software Testers
Paper
• 2406.12952
• Published
• 1
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks
at Scale
Paper
• 2409.16299
• Published
• 11
Reinforcement Learning: An Overview
Paper
• 2412.05265
• Published
• 8
Automated Reinforcement Learning: An Overview
Paper
• 2201.05000
• Published
3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D
Scene Understanding
Paper
• 2412.18450
• Published
• 36
3D Scene Graph Guided Vision-Language Pre-training
Paper
• 2411.18666
• Published
Fourier Position Embedding: Enhancing Attention's Periodic Extension for
Length Generalization
Paper
• 2412.17739
• Published
• 41
GIRAFFE: Design Choices for Extending the Context Length of Visual
Language Models
Paper
• 2412.12735
• Published
Breaking the Stage Barrier: A Novel Single-Stage Approach to Long
Context Extension for Large Language Models
Paper
• 2412.07171
• Published
• 1
In Case You Missed It: ARC 'Challenge' Is Not That Challenging
Paper
• 2412.17758
• Published
• 17
Data Laundering: Artificially Boosting Benchmark Results through
Knowledge Distillation
Paper
• 2412.15255
• Published
• 4
Rethinking Thinking Tokens: Understanding Why They Underperform in
Practice
Paper
• 2411.11371
• Published
Are Your LLMs Capable of Stable Reasoning?
Paper
• 2412.13147
• Published
• 93
A NotSo Simple Way to Beat Simple Bench
Paper
• 2412.12173
• Published
Are You Doubtful? Oh, It Might Be Difficult Then! Exploring the Use of
Model Uncertainty for Question Difficulty Estimation
Paper
• 2412.11831
• Published
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Paper
• 2412.14711
• Published
• 16
A Survey on Inference Optimization Techniques for Mixture of Experts
Models
Paper
• 2412.14219
• Published
PERFT: Parameter-Efficient Routed Fine-Tuning for Mixture-of-Expert
Model
Paper
• 2411.08212
• Published
RobustFT: Robust Supervised Fine-tuning for Large Language Models under
Noisy Response
Paper
• 2412.14922
• Published
• 88
B-STaR: Monitoring and Balancing Exploration and Exploitation in
Self-Taught Reasoners
Paper
• 2412.17256
• Published
• 47
Deliberation in Latent Space via Differentiable Cache Augmentation
Paper
• 2412.17747
• Published
• 32
Diving into Self-Evolving Training for Multimodal Reasoning
Paper
• 2412.17451
• Published
• 42
Paper
• 2412.16720
• Published
• 37
Revisiting In-Context Learning with Long Context Language Models
Paper
• 2412.16926
• Published
• 32
LearnLM: Improving Gemini for Learning
Paper
• 2412.16429
• Published
• 22
Outcome-Refining Process Supervision for Code Generation
Paper
• 2412.15118
• Published
• 19
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
Paper
• 2412.17498
• Published
• 22
PC Agent: While You Sleep, AI Works -- A Cognitive Journey into Digital
World
Paper
• 2412.17589
• Published
• 14
A Survey on Human-Centric LLMs
Paper
• 2411.14491
• Published
Agent-SafetyBench: Evaluating the Safety of LLM Agents
Paper
• 2412.14470
• Published
• 13
NILE: Internal Consistency Alignment in Large Language Models
Paper
• 2412.16686
• Published
• 8
Parallelized Autoregressive Visual Generation
Paper
• 2412.15119
• Published
• 53
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper
• 2412.16145
• Published
• 38
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation
Paper
• 2412.13649
• Published
• 21
DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context
LLMs
Paper
• 2412.14838
• Published
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
Paper
• 2412.10319
• Published
• 11
Progressive Multimodal Reasoning via Active Retrieval
Paper
• 2412.14835
• Published
• 73
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic
Long-context Multitasks
Paper
• 2412.15204
• Published
• 38
Token-Budget-Aware LLM Reasoning
Paper
• 2412.18547
• Published
• 46
MMFactory: A Universal Solution Search Engine for Vision-Language Tasks
Paper
• 2412.18072
• Published
• 18
A Silver Bullet or a Compromise for Full Attention? A Comprehensive
Study of Gist Token-based Context Compression
Paper
• 2412.17483
• Published
• 34
CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge
Graphs in the LLM Era
Paper
• 2412.18702
• Published
• 8
Task Preference Optimization: Improving Multimodal Large Language Models
with Vision Task Alignment
Paper
• 2412.19326
• Published
• 18
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
Paper
• 2412.18619
• Published
• 60
Paper
• 2412.18653
• Published
• 86
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
Paper
• 2412.18925
• Published
• 107
On the Compositional Generalization of Multimodal LLMs for Medical
Imaging
Paper
• 2412.20070
• Published
• 42
2.5 Years in Class: A Multimodal Textbook for Vision-Language
Pretraining
Paper
• 2501.00958
• Published
• 109
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion
Control
Paper
• 2501.01427
• Published
• 53
CodeElo: Benchmarking Competition-level Code Generation of LLMs with
Human-comparable Elo Ratings
Paper
• 2501.01257
• Published
• 51
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent
Diffusion Models
Paper
• 2501.01423
• Published
• 44
ProgCo: Program Helps Self-Correction of Large Language Models
Paper
• 2501.01264
• Published
• 26
Unifying Specialized Visual Encoders for Video Language Models
Paper
• 2501.01426
• Published
• 20
Dynamic Scaling of Unit Tests for Code Reward Modeling
Paper
• 2501.01054
• Published
• 16
Understanding and Mitigating Bottlenecks of State Space Models through
the Lens of Recency and Over-smoothing
Paper
• 2501.00658
• Published
• 7
Attamba: Attending To Multi-Token States
Paper
• 2411.17685
• Published
Xmodel-2 Technical Report
Paper
• 2412.19638
• Published
• 27
Deep Learning-based Approaches for State Space Models: A Selective
Review
Paper
• 2412.11211
• Published
On the Expressiveness and Length Generalization of Selective State-Space
Models on Regular Languages
Paper
• 2412.19350
• Published
Test-time Computing: from System-1 Thinking to System-2 Thinking
Paper
• 2501.02497
• Published
• 45
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models
in Multi-Hop Tool Use
Paper
• 2501.02506
• Published
• 10
Scaling Laws for Floating Point Quantization Training
Paper
• 2501.02423
• Published
• 26
Graph Generative Pre-trained Transformer
Paper
• 2501.01073
• Published
• 18
Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive
Experiments, Analysis, and Improvements
Paper
• 2501.00773
• Published
• 1
Personalized Graph-Based Retrieval for Large Language Models
Paper
• 2501.02157
• Published
• 31
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal
Sampling
Paper
• 2408.16737
• Published
• 1
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model
Paper
• 2410.13639
• Published
• 19
Reinforcement Learning Enhanced LLMs: A Survey
Paper
• 2412.10400
• Published
Personalized Audiobook Recommendations at Spotify Through Graph Neural
Networks
Paper
• 2403.05185
• Published
• 23
Dynamic graph neural networks for enhanced volatility prediction in
financial markets
Paper
• 2410.16858
• Published
• 1
Cooperative Graph Neural Networks
Paper
• 2310.01267
• Published
• 1
A Survey on Graph Neural Networks for Time Series: Forecasting,
Classification, Imputation, and Anomaly Detection
Paper
• 2307.03759
• Published
• 1
Spatio-Temporal Graph Neural Networks: A Survey
Paper
• 2301.10569
• Published
• 1
A Survey of Graph Neural Networks for Social Recommender Systems
Paper
• 2212.04481
• Published
• 1
Multi-Reranker: Maximizing performance of retrieval-augmented generation
in the FinanceRAG challenge
Paper
• 2411.16732
• Published
• 1
FinGen: A Dataset for Argument Generation in Finance
Paper
• 2405.20708
• Published
'Finance Wizard' at the FinLLM Challenge Task: Financial Text
Summarization
Paper
• 2408.03762
• Published
• 1
MME-Finance: A Multimodal Finance Benchmark for Expert-level
Understanding and Reasoning
Paper
• 2411.03314
• Published
• 1
A Survey of Large Language Models in Finance (FinLLMs)
Paper
• 2402.02315
• Published
• 1
FinTral: A Family of GPT-4 Level Multimodal Financial Large Language
Models
Paper
• 2402.10986
• Published
• 81
GeAR: Generation Augmented Retrieval
Paper
• 2501.02772
• Published
• 21
Agent Laboratory: Using LLM Agents as Research Assistants
Paper
• 2501.04227
• Published
• 95
LLM4SR: A Survey on Large Language Models for Scientific Research
Paper
• 2501.04306
• Published
• 35
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta
Chain-of-Thought
Paper
• 2501.04682
• Published
• 99
URSA: Understanding and Verifying Chain-of-thought Reasoning in
Multimodal Mathematics
Paper
• 2501.04686
• Published
• 53
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep
Thinking
Paper
• 2501.04519
• Published
• 288
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language
Models
Paper
• 2501.03262
• Published
• 104
Cosmos World Foundation Model Platform for Physical AI
Paper
• 2501.03575
• Published
• 82
DeepSeek-V3 Technical Report
Paper
• 2412.19437
• Published
• 76
Synthetic Vision: Training Vision-Language Models to Understand Physics
Paper
• 2412.08619
• Published
Large Action Models: From Inception to Implementation
Paper
• 2412.10047
• Published
• 36
MotionBench: Benchmarking and Improving Fine-grained Video Motion
Understanding for Vision Language Models
Paper
• 2501.02955
• Published
• 44
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
Paper
• 2501.03936
• Published
• 23
Dolphin: Closed-loop Open-ended Auto-research through Thinking,
Practice, and Feedback
Paper
• 2501.03916
• Published
• 16
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers
Paper
• 2501.02393
• Published
• 7
VideoRAG: Retrieval-Augmented Generation over Video Corpus
Paper
• 2501.05874
• Published
• 75
Infecting Generative AI With Viruses
Paper
• 2501.05542
• Published
• 13
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Paper
• 2501.05707
• Published
• 20
Demystifying Domain-adaptive Post-training for Financial LLMs
Paper
• 2501.04961
• Published
• 11
Enhancing Human-Like Responses in Large Language Models
Paper
• 2501.05032
• Published
• 61
An Empirical Study of Autoregressive Pre-training from Videos
Paper
• 2501.05453
• Published
• 41
The Lessons of Developing Process Reward Models in Mathematical
Reasoning
Paper
• 2501.07301
• Published
• 100
Tensor Product Attention Is All You Need
Paper
• 2501.06425
• Published
• 90
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical
Reasoning
Paper
• 2501.06458
• Published
• 31
O1 Replication Journey: A Strategic Progress Report -- Part 1
Paper
• 2410.18982
• Published
• 3
Transformer^2: Self-adaptive LLMs
Paper
• 2501.06252
• Published
• 55
VideoAuteur: Towards Long Narrative Video Generation
Paper
• 2501.06173
• Published
• 31
Towards Best Practices for Open Datasets for LLM Training
Paper
• 2501.08365
• Published
• 62
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents
Paper
• 2501.08828
• Published
• 30
Re-ranking the Context for Multimodal Retrieval Augmented Generation
Paper
• 2501.04695
• Published
• 1
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities
Paper
• 2501.08983
• Published
• 22
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper
• 2501.08313
• Published
• 300
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language
Datasets for African Languages
Paper
• 2501.08284
• Published
• 7
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising
Steps
Paper
• 2501.09732
• Published
• 72
OmniThink: Expanding Knowledge Boundaries in Machine Writing through
Thinking
Paper
• 2501.09751
• Published
• 46
Learnings from Scaling Visual Tokenizers for Reconstruction and
Generation
Paper
• 2501.09755
• Published
• 35
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper
• 2501.09686
• Published
• 41
Do generative video models learn physical principles from watching
videos?
Paper
• 2501.09038
• Published
• 34
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling
under Long-Context Scenario
Paper
• 2501.10132
• Published
• 22
PaSa: An LLM Agent for Comprehensive Academic Paper Search
Paper
• 2501.10120
• Published
• 54
Evolving Deeper LLM Thinking
Paper
• 2501.09891
• Published
• 115
FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in
Virtual 3D Spaces
Paper
• 2501.12909
• Published
• 74
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via
Reinforcement Learning
Paper
• 2501.12948
• Published
• 440
Autonomy-of-Experts Models
Paper
• 2501.13074
• Published
• 44
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper
• 2501.12599
• Published
• 126
Agent-R: Training Language Model Agents to Reflect via Iterative
Self-Training
Paper
• 2501.11425
• Published
• 109
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Paper
• 2501.12326
• Published
• 64
TokenVerse: Versatile Multi-concept Personalization in Token Modulation
Space
Paper
• 2501.12224
• Published
• 48
Reasoning Language Models: A Blueprint
Paper
• 2501.11223
• Published
• 33
Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D
Assets Generation
Paper
• 2501.12202
• Published
• 49
Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in
Realistic Environments
Paper
• 2501.10893
• Published
• 26
The Geometry of Tokens in Internal Representations of Large Language
Models
Paper
• 2501.10573
• Published
• 9
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
Paper
• 2501.09781
• Published
• 27
GameFactory: Creating New Games with Generative Interactive Videos
Paper
• 2501.08325
• Published
• 67
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
Paper
• 2501.13200
• Published
• 69
Sigma: Differential Rescaling of Query, Key and Value for Efficient
Language Models
Paper
• 2501.13629
• Published
• 48
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model
Post-training
Paper
• 2501.17161
• Published
• 124
Optimizing Large Language Model Training Using FP4 Quantization
Paper
• 2501.17116
• Published
• 36
Open Problems in Mechanistic Interpretability
Paper
• 2501.16496
• Published
• 22
Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
Paper
• 2501.16975
• Published
• 32
Baichuan-Omni-1.5 Technical Report
Paper
• 2501.15368
• Published
• 60
Qwen2.5-1M Technical Report
Paper
• 2501.15383
• Published
• 72
Towards General-Purpose Model-Free Reinforcement Learning
Paper
• 2501.16142
• Published
• 31
CodeMonkeys: Scaling Test-Time Compute for Software Engineering
Paper
• 2501.14723
• Published
• 10
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for
Mixture-of-Experts Language Models
Paper
• 2501.12370
• Published
• 11
Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with
Modality-Aware Sparsity
Paper
• 2501.16295
• Published
• 8
OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale
Synthetic Personas
Paper
• 2501.15427
• Published
• 6
Return of the Encoder: Maximizing Parameter Efficiency for SLMs
Paper
• 2501.16273
• Published
• 5
Chain-of-Retrieval Augmented Generation
Paper
• 2501.14342
• Published
• 58
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model
Critiques
Paper
• 2501.14492
• Published
• 27
RL + Transformer = A General-Purpose Problem Solver
Paper
• 2501.14176
• Published
• 28
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
Paper
• 2501.18585
• Published
• 61