Improve model card

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +31 -7
README.md CHANGED
@@ -1,18 +1,42 @@
  ---
  license: mit
  ---

  # **SFHand – Official Checkpoint**

- This repository provides the **official pretrained checkpoint** for **SFHand**, a streaming framework for **language-guided 3D hand forecasting and embodied manipulation**.

  ---

  ## 🔗 Project Links

- [![Paper](https://img.shields.io/badge/Paper-B31B1B?style=for-the-badge\&logo=arxiv\&logoColor=white)](https://arxiv.org/pdf/2511.18127)
- [![Data](https://img.shields.io/badge/Data-0040A1?style=for-the-badge\&logo=huggingface\&logoColor=ffffff)](https://huggingface.co/datasets/ut-vision/EgoHaFL)
- [![GitHub](https://img.shields.io/badge/GitHub-000000?style=for-the-badge\&logo=github\&logoColor=ffffff)](https://github.com/ut-vision/SFHand)

  ---

@@ -20,11 +44,11 @@ This repository provides the **official pretrained checkpoint** for **SFHand**,

  If you use this model or find SFHand helpful in your research, please cite:

- ```latex
  @article{liu2025sfhand,
  title={SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation},
- author={Liu, Ruicong and Huang, Yifei and Ouyang, Liangyang and Kang, Caixin and and Sato, Yoichi},
  journal={arXiv preprint arXiv:2511.18127},
  year={2025}
  }
- ```
 
  ---
  license: mit
+ pipeline_tag: robotics
  ---

  # **SFHand – Official Checkpoint**

+ This repository provides the **official pretrained checkpoint** for **SFHand**, a streaming framework for **language-guided 3D hand forecasting and embodied manipulation**, as introduced in the paper [SFHand: Learning Embodied Manipulation by Streaming Egocentric 3D Hand Forecasting](https://huggingface.co/papers/2511.18127).

  ---

  ## 🔗 Project Links

+ - **Paper:** [arXiv:2511.18127](https://arxiv.org/abs/2511.18127)
+ - **GitHub:** [ut-vision/SFHand](https://github.com/ut-vision/SFHand)
+ - **Dataset:** [EgoHaFL](https://huggingface.co/datasets/ut-vision/EgoHaFL)
+
+ ---
+
+ ## 📝 Introduction
+
+ SFHand is the first streaming architecture for language-guided 3D hand forecasting. It autoregressively predicts future hand dynamics from continuous egocentric video and text instructions, outputting hand type, 2D bounding boxes, 3D poses, and 3D trajectories.
+
+ Key features include:
+ - **Streaming Framework:** Autoregressive multi-modal hand forecasting.
+ - **ROI-Enhanced Memory:** Captures temporal hand awareness while focusing on salient regions.
+ - **Embodied Ready:** Representations transfer effectively to downstream manipulation tasks.
+
+ ---
+
+ ## 🚀 Evaluation and Visualization
+
+ To evaluate the model and generate visualizations using this checkpoint, you can run the following command from the [official repository](https://github.com/ut-vision/SFHand):
+
+ ```bash
+ python main.py --config_file configs/config/clip_base_eval.yml --eval --vis
+ ```
+
+ Output visualizations will be saved to the `./render_results/` directory.

  ---

 

  If you use this model or find SFHand helpful in your research, please cite:

+ ```bibtex
  @article{liu2025sfhand,
  title={SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation},
+ author={Liu, Ruicong and Huang, Yifei and Ouyang, Liangyang and Kang, Caixin and Sato, Yoichi},
  journal={arXiv preprint arXiv:2511.18127},
  year={2025}
  }
+ ```
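
The evaluation command added to the README can be wrapped in a small pre-flight check so that a missing config file is reported before anything is launched. The following is only a minimal sketch: the helper names `build_eval_command` and `missing_prerequisites` are hypothetical and not part of the SFHand repository, and it assumes the script is run from the root of a clone of `ut-vision/SFHand`. The flags and the config path are the ones quoted in the README; whether `main.py` accepts `--eval` without `--vis` is an assumption.

```python
import shlex
import sys
from pathlib import Path

# Config path and flags are copied from the command in the README;
# nothing else about main.py is assumed here.
DEFAULT_CONFIG = "configs/config/clip_base_eval.yml"


def build_eval_command(config_file=DEFAULT_CONFIG, visualize=True, python=sys.executable):
    """Return the argv list for the evaluation run described in the README."""
    cmd = [python, "main.py", "--config_file", config_file, "--eval"]
    if visualize:
        # Assumption: --vis is optional and only toggles rendering.
        cmd.append("--vis")
    return cmd


def missing_prerequisites(repo_dir, config_file=DEFAULT_CONFIG):
    """Hypothetical pre-flight check: list required files absent under repo_dir."""
    repo = Path(repo_dir)
    required = (repo / "main.py", repo / config_file)
    return [str(path) for path in required if not path.exists()]


if __name__ == "__main__":
    repo_dir = Path.cwd()  # assumption: current directory is the SFHand clone
    missing = missing_prerequisites(repo_dir)
    if missing:
        print("Not running; missing:", ", ".join(missing))
    else:
        # Print the command instead of executing it; to actually launch it,
        # pass the same list to subprocess.run(..., cwd=repo_dir, check=True).
        print("Would run:", shlex.join(build_eval_command()))
```

Building the command as a list rather than a single string avoids shell quoting issues if the config path ever contains spaces.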