Manan Shah

cs-mshah

https://cs-mshah.github.io/

AI & ML interests

Computer Vision

Recent Activity

upvoted a paper about 9 hours ago

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Paper • 2512.20557 • Published 4 days ago • 46

upvoted an article 3 days ago

Article

SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data

Jun 3 • 299

liked a model 4 days ago

Qwen/Qwen-Image-Edit-2511

Image-to-Image • Updated 5 days ago • 14.5k • 454

liked a model 10 days ago

browser-use/bu-30b-a3b-preview

Image-Text-to-Text • 31B • Updated 4 days ago • 4.81k • 211

liked a model 12 days ago

lokiz666/Realgen-detection-models

Text-to-Image • Updated 17 days ago • 10 • 15

liked a Space 14 days ago

WindowSeat Reflection Removal Web

🪟

Remove reflections from images easily

upvoted an article 22 days ago

Article

We Got Claude to Fine-Tune an Open Source LLM

24 days ago • 540

liked 2 models 24 days ago

oumoumad/Qwen-Edit-2509-Material-transfer

Image-to-Image • Updated 25 days ago • 32

oumoumad/Qwen-Edit-2509-Extract-materials

Updated 25 days ago • 9

upvoted an article about 1 month ago

Article

Continuous batching from first principles

Nov 25 • 286

updated a dataset about 1 month ago

VLR16824/vlr_data

Updated 27 days ago • 9

upvoted a collection about 1 month ago

MetaCLIP2 Multilingual

Collection

8 items • Updated Nov 12 • 16

liked 2 models about 1 month ago

dx8152/Qwen-Image-Edit-2509-White_to_Scene

Image-to-Image • Updated Nov 12 • 3.12k • 116

eigen-ai-labs/eigen-banana-qwen-image-edit

Text-to-Image • Updated Nov 16 • 96 • 250

upvoted a collection about 1 month ago

📄 FinePDFs

Collection

81 items • Updated Nov 11 • 25

upvoted a paper 2 months ago

Robot Learning: A Tutorial

Paper • 2510.12403 • Published Oct 14 • 118

liked a Space 3 months ago

Transformers Timeline

🤗

Interactive timeline to explore the 🤗Transformers models

upvoted an article 3 months ago

Article

Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 👐 📚

Jul 10, 2024

upvoted 2 papers 3 months ago

Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents

Paper • 2507.04009 • Published Jul 5 • 51

StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation

Paper • 2508.08248 • Published Aug 11 • 27
