Performance on MTEB

by JustJaro - opened Dec 1, 2024

Discussion

JustJaro

Dec 1, 2024

Hi,

I was wondering whether using this model for both text and image embeddings would degrade text performance; from the benchmarks, it's not quite clear how it stands on MTEB.

Could you shed some light on it? Is it better or worse than, for example, intfloat/e5-mistral-7b-instruct?

Thanks for your help.

Cheers,
Jaro

ziyjiang

TIGER-Lab org Dec 1, 2024

@JustJaro

This is a great question. We haven't tested it on MTEB yet, but doing so is part of our plan.
I expect the MTEB results may not be as strong as those of current state-of-the-art text embedding models, since we haven't trained on any text-only data. One of our key next steps is to train a model on a combination of text pairwise data and our current image pairwise data. We believe that adding more text pairwise data could also benefit image-related tasks, based on insights from other literature (such as E5-v).
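
Until official MTEB numbers exist, a rough text-only sanity check can be sketched as below. This is only an illustration, not the MTEB protocol and not VLM2Vec's API: `toy_embed` is a hypothetical stand-in for whatever function turns a string into a vector, and `recall_at_1` is a name made up for this sketch. Swapping in two real embedding functions and comparing their scores on the same query/document pairs would give a first impression of relative text retrieval quality.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall_at_1(
    embed: Callable[[str], Vector],
    queries: List[str],
    documents: List[str],
    gold: List[int],
) -> float:
    """Fraction of queries whose most similar document is the gold one."""
    doc_vecs = [embed(d) for d in documents]
    hits = 0
    for query, gold_idx in zip(queries, gold):
        q_vec = embed(query)
        # Index of the document with the highest cosine similarity.
        best = max(range(len(doc_vecs)), key=lambda i: cosine(q_vec, doc_vecs[i]))
        hits += int(best == gold_idx)
    return hits / len(queries)


def toy_embed(text: str) -> List[float]:
    """Hypothetical placeholder embedding: a 26-dim letter-count vector.

    Replace this with a real model's text-embedding function.
    """
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


if __name__ == "__main__":
    queries = ["capital of france"]
    documents = ["paris is the capital of france", "zzz qqq"]
    gold = [0]  # queries[0] should retrieve documents[0]
    print(recall_at_1(toy_embed, queries, documents, gold))
```

A single recall@1 figure on a handful of pairs is far coarser than MTEB, which spans many tasks and datasets, so this is only useful as a quick relative comparison between two embedding functions on the same data.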

JustJaro

Dec 3, 2024

Thanks for your answer. Sounds like a great plan - and cheers for the great work on the embedding model!

ziyjiang changed discussion status to closed Dec 16, 2024
