How to Deploy on Inference Endpoints - handler.py

by brianjking - opened Jul 10, 2023

Discussion

brianjking

Jul 10, 2023

Hello,

I'd love to deploy this to Hugging Face Inference Endpoints; however, it's missing the handler.py file.

Does anyone have any tips? https://huggingface.co/docs/inference-endpoints/guides/custom_handler

When I tried to deploy it I received this error:

56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Start loading image artifacts from huggingface.co
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Used configuration:
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Repository ID: Salesforce/instructblip-flan-t5-xl
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Repository Revision: 6c0cf6bef6330a114473cb5cec43d7beeb2a74ac
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Ignore regex pattern for files, which are not downloaded: tf*, flax*, rust*, *onnx, *safetensors, *mlmodel, *tflite, *tar.gz, *ckpt
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,192 | INFO | Initializing model from directory:/repository
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,192 | INFO | No custom pipeline found at /repository/handler.py
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,193 | INFO | Using device GPU
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z Traceback (most recent call last):
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 677, in lifespan
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     async with self.lifespan_context(app) as maybe_state:
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     await self._router.startup()
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 654, in startup
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     await handler()
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/app/webservice_starlette.py", line 57, in some_startup_task
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/app/huggingface_inference_toolkit/handler.py", line 46, in get_inference_handler_either_custom_or_default_handler
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     return HuggingFaceHandler(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     self.pipeline = get_pipeline(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/app/huggingface_inference_toolkit/utils.py", line 263, in get_pipeline
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 692, in pipeline
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     config_class = CONFIG_MAPPING[config_dict["model_type"]]
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z   File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z     raise KeyError(key)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z KeyError: 'instructblip'
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z Application startup failed. Exiting.

brianjking changed discussion title from How to Deploy on Inference Endpoints to How to Deploy on Inference Endpoints - handler.py Jul 10, 2023

nielsr

Jul 10, 2023

Hi,

Have you implemented a handler script as explained in the guide above? If so, could you share this script?

brianjking

Jul 10, 2023

@nielsr I tried to submit a PR, but I have no idea whether it works: https://huggingface.co/Salesforce/instructblip-flan-t5-xl/discussions/5.

Any help? Thanks!

nielsr

Feb 10, 2024

Hi,

There's no need to add the handler.py within this repository. You can just create a new model repository and add a handler.py script there, where you use:

from transformers import InstructBlipForConditionalGeneration

model = InstructBlipForConditionalGeneration.from_pretrained("Salesforce/instructblip-flan-t5-xl")

in the __init__ of the handler class.
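
For anyone landing here later, below is a rough sketch of what such a handler.py could look like. It is not an official or tested handler. The EndpointHandler class with __init__ and __call__ follows the custom handler guide linked above, but the request shape ({"inputs": {"image": "<base64>", "prompt": "..."}}), the parse_request helper, the default prompt, and max_new_tokens=64 are all assumptions made for illustration. The KeyError: 'instructblip' in the log also suggests the endpoint image ships a transformers version that does not know this model type yet, so the new repository would probably need a requirements.txt asking for a newer transformers as well.

```python
# handler.py -- illustrative sketch only; payload format and defaults are assumptions.
import base64
import io
from typing import Any, Dict, List, Tuple

MODEL_ID = "Salesforce/instructblip-flan-t5-xl"


def parse_request(data: Dict[str, Any]) -> Tuple[bytes, str]:
    """Pull raw image bytes and the text prompt out of the request body.

    Assumed (hypothetical) payload: {"inputs": {"image": "<base64>", "prompt": "..."}}
    """
    inputs = data.get("inputs", data)
    image_bytes = base64.b64decode(inputs["image"])
    prompt = inputs.get("prompt", "Describe the image.")
    return image_bytes, prompt


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Heavy imports stay local so the module can be imported without them.
        import torch
        from transformers import (
            InstructBlipForConditionalGeneration,
            InstructBlipProcessor,
        )

        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        # Weights are pulled from the original repository, as suggested above,
        # rather than from the repository that hosts this handler.
        self.processor = InstructBlipProcessor.from_pretrained(MODEL_ID)
        self.model = InstructBlipForConditionalGeneration.from_pretrained(MODEL_ID)
        self.model.to(self.device)

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, str]]:
        from PIL import Image

        image_bytes, prompt = parse_request(data)
        image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
        inputs = self.processor(images=image, text=prompt, return_tensors="pt")
        inputs = inputs.to(self.device)
        output_ids = self.model.generate(**inputs, max_new_tokens=64)
        text = self.processor.batch_decode(output_ids, skip_special_tokens=True)[0]
        return [{"generated_text": text.strip()}]
```

A client would then send a JSON body with the image base64-encoded under "inputs", matching whatever shape parse_request expects; adjust both sides together if you prefer a different format.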
