Kaicheng Yang

Kaichengalex

https://kaichengyang0828.github.io/Kaicheng-Yang0828.github.io/

kaichengyang0828

AI & ML interests

Multimodal Representation Learning / Vision-Language Pretraining / DeepResearch

Recent Activity

upvoted a paper 3 days ago

Near-Future Policy Optimization

authored a paper 9 days ago

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

updated a collection 9 days ago

UniDoc-RL

upvoted a paper 3 days ago

Near-Future Policy Optimization

Paper • 2604.20733 • Published 4 days ago • 63

upvoted a paper 9 days ago

UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards

Paper • 2604.14967 • Published 10 days ago • 15

upvoted a paper 10 days ago

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published 11 days ago • 152

upvoted a paper 18 days ago

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published 20 days ago • 234

upvoted a paper 25 days ago

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 28 days ago • 144

upvoted 3 papers about 2 months ago

upvoted a paper 2 months ago

OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52

upvoted 3 papers 3 months ago

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 195

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published Jan 15 • 31

DanQing: An Up-to-Date Large-Scale Chinese Vision-Language Pre-training Dataset

Paper • 2601.10305 • Published Jan 15 • 36

upvoted 2 papers 4 months ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published Jan 8 • 230

Latent Implicit Visual Reasoning

Paper • 2512.21218 • Published Dec 24, 2025 • 70

upvoted a collection 4 months ago

Molmo2 Data

Collection

Artifacts for the Molmo2 data release • 13 items • Updated Mar 2 • 39

upvoted 2 papers 4 months ago

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

Paper • 2512.14052 • Published Dec 16, 2025 • 42

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106

upvoted 3 papers 5 months ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 162

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 267

InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision

Paper • 2512.01342 • Published Dec 1, 2025 • 19
