How to Deploy on Inference Endpoints - handler.py
Hello,
I'd love to deploy this model to a Hugging Face Inference Endpoint, but the repository is missing a handler.py file.
Does anyone have any tips? https://huggingface.co/docs/inference-endpoints/guides/custom_handler
When I tried to deploy it I received this error:
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Start loading image artifacts from huggingface.co
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Repository Revision: 6c0cf6bef6330a114473cb5cec43d7beeb2a74ac
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Used configuration:
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Repository ID: Salesforce/instructblip-flan-t5-xl
56f5d4ff8bqr6jr 2023-07-10T16:39:16.116Z INFO | Ignore regex pattern for files, which are not downloaded: tf*, flax*, rust*, *onnx, *safetensors, *mlmodel, *tflite, *tar.gz, *ckpt
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,192 | INFO | Initializing model from directory:/repository
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,193 | INFO | Using device GPU
56f5d4ff8bqr6jr 2023-07-10T16:42:45.193Z 2023-07-10 16:42:45,192 | INFO | No custom pipeline found at /repository/handler.py
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z return HuggingFaceHandler(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z await handler()
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z KeyError: 'instructblip'
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 654, in startup
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z self.pipeline = get_pipeline(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/app/webservice_starlette.py", line 57, in some_startup_task
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z config_class = CONFIG_MAPPING[config_dict["model_type"]]
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z raise KeyError(key)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z async with self.lifespan_context(app) as maybe_state:
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/app/huggingface_inference_toolkit/utils.py", line 263, in get_pipeline
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/app/huggingface_inference_toolkit/handler.py", line 46, in get_inference_handler_either_custom_or_default_handler
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 677, in lifespan
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z Application startup failed. Exiting.
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 692, in pipeline
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z await self._router.startup()
56f5d4ff8bqr6jr 2023-07-10T16:42:45.194Z Traceback (most recent call last):
56f5d4ff8bqr6jr 2023-07-10T16:42:48.903Z 2023-07-10 16:42:48,903 | INFO | Initializing model from directory:/repository
56f5d4ff8bqr6jr 2023-07-10T16:42:48.903Z 2023-07-10 16:42:48,903 | INFO | No custom pipeline found at /repository/handler.py
56f5d4ff8bqr6jr 2023-07-10T16:42:48.903Z 2023-07-10 16:42:48,903 | INFO | Using device GPU
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z raise KeyError(key)
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z self.pipeline = get_pipeline(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z await self._router.startup()
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/app/huggingface_inference_toolkit/utils.py", line 263, in get_pipeline
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 677, in lifespan
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z KeyError: 'instructblip'
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/app/huggingface_inference_toolkit/handler.py", line 46, in get_inference_handler_either_custom_or_default_handler
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 654, in startup
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 692, in pipeline
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z Traceback (most recent call last):
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z config_class = CONFIG_MAPPING[config_dict["model_type"]]
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z await handler()
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/app/webservice_starlette.py", line 57, in some_startup_task
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z return HuggingFaceHandler(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z async with self.lifespan_context(app) as maybe_state:
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
56f5d4ff8bqr6jr 2023-07-10T16:42:48.904Z Application startup failed. Exiting.
56f5d4ff8bqr6jr 2023-07-10T16:43:06.920Z 2023-07-10 16:43:06,920 | INFO | Initializing model from directory:/repository
56f5d4ff8bqr6jr 2023-07-10T16:43:06.920Z 2023-07-10 16:43:06,920 | INFO | Using device GPU
56f5d4ff8bqr6jr 2023-07-10T16:43:06.920Z 2023-07-10 16:43:06,920 | INFO | No custom pipeline found at /repository/handler.py
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z Application startup failed. Exiting.
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z config_class = CONFIG_MAPPING[config_dict["model_type"]]
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z async with self.lifespan_context(app) as maybe_state:
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z Traceback (most recent call last):
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/app/huggingface_inference_toolkit/handler.py", line 46, in get_inference_handler_either_custom_or_default_handler
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 677, in lifespan
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z KeyError: 'instructblip'
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z await handler()
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 692, in pipeline
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z self.pipeline = get_pipeline(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z return HuggingFaceHandler(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/app/webservice_starlette.py", line 57, in some_startup_task
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 654, in startup
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/app/huggingface_inference_toolkit/utils.py", line 263, in get_pipeline
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z raise KeyError(key)
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:43:06.921Z await self._router.startup()
56f5d4ff8bqr6jr 2023-07-10T16:43:36.903Z 2023-07-10 16:43:36,902 | INFO | Initializing model from directory:/repository
56f5d4ff8bqr6jr 2023-07-10T16:43:36.903Z 2023-07-10 16:43:36,903 | INFO | No custom pipeline found at /repository/handler.py
56f5d4ff8bqr6jr 2023-07-10T16:43:36.903Z 2023-07-10 16:43:36,903 | INFO | Using device GPU
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 623, in __getitem__
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/app/huggingface_inference_toolkit/utils.py", line 263, in get_pipeline
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/configuration_auto.py", line 917, in from_pretrained
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z KeyError: 'instructblip'
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z self.pipeline = get_pipeline(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z return HuggingFaceHandler(model_dir=model_dir, task=task)
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/app/webservice_starlette.py", line 57, in some_startup_task
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z hf_pipeline = pipeline(task=task, model=model_dir, device=device, **kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z await self._router.startup()
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 654, in startup
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z config_class = CONFIG_MAPPING[config_dict["model_type"]]
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/opt/conda/lib/python3.9/site-packages/transformers/pipelines/__init__.py", line 692, in pipeline
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z config = AutoConfig.from_pretrained(model, _from_pipeline=task, **hub_kwargs, **model_kwargs)
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 677, in lifespan
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z Application startup failed. Exiting.
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z raise KeyError(key)
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/app/huggingface_inference_toolkit/handler.py", line 17, in __init__
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z async with self.lifespan_context(app) as maybe_state:
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z File "/app/huggingface_inference_toolkit/handler.py", line 46, in get_inference_handler_either_custom_or_default_handler
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z await handler()
56f5d4ff8bqr6jr 2023-07-10T16:43:36.904Z Traceback (most recent call last):
Hi,
Have you implemented a handler script as explained in the guide above? If so, could you share this script?
@nielsr I tried to submit a PR, but I have no idea whether it works: https://huggingface.co/Salesforce/instructblip-flan-t5-xl/discussions/5.
Any help? Thanks!
Hi,
There's no need to add the handler.py within this repository. You can just create a new model repository and add a handler.py script there, where you use:
model = InstructBlipForConditionalGeneration.from_pretrained("Salesforce/instructblip-flan-t5-xl")
in the __init__ of the handler class.
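For anyone landing here, this is a rough sketch of what such a handler.py in a separate repository might look like. It follows the EndpointHandler layout (__init__ plus __call__) from the custom handler guide linked above. The request layout (a base64 image under inputs.image, a prompt under inputs.text, generation options under parameters), the parse_payload helper, and the default of 64 new tokens are assumptions made for illustration and have not been tested on an actual endpoint. The KeyError: 'instructblip' in the log also suggests that the endpoint image shipped a transformers version that did not yet know this model type, so the new repository probably needs a requirements.txt asking for a newer transformers as well.

```python
# handler.py -- untested sketch of a custom Inference Endpoints handler.
import base64
from typing import Any, Dict, List, Tuple

MODEL_ID = "Salesforce/instructblip-flan-t5-xl"


def parse_payload(data: Dict[str, Any]) -> Tuple[bytes, str, Dict[str, Any]]:
    """Split a request into raw image bytes, prompt text, and generate() kwargs.

    Assumed (hypothetical) payload layout:
    {"inputs": {"image": "<base64>", "text": "<prompt>"}, "parameters": {...}}
    """
    inputs = data["inputs"]
    image_bytes = base64.b64decode(inputs["image"])
    prompt = inputs.get("text", "")
    gen_kwargs = dict(data.get("parameters") or {})
    gen_kwargs.setdefault("max_new_tokens", 64)  # arbitrary default for this sketch
    return image_bytes, prompt, gen_kwargs


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Heavy imports are deferred so the module itself imports cheaply.
        import torch
        from transformers import (
            InstructBlipForConditionalGeneration,
            InstructBlipProcessor,
        )

        # `path` (the handler repository) is ignored; weights come from the Hub.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.processor = InstructBlipProcessor.from_pretrained(MODEL_ID)
        self.model = InstructBlipForConditionalGeneration.from_pretrained(MODEL_ID)
        self.model.to(self.device).eval()

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, str]]:
        import io

        import torch
        from PIL import Image

        image_bytes, prompt, gen_kwargs = parse_payload(data)
        image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
        inputs = self.processor(images=image, text=prompt, return_tensors="pt")
        inputs = inputs.to(self.device)
        with torch.no_grad():
            output_ids = self.model.generate(**inputs, **gen_kwargs)
        text = self.processor.batch_decode(output_ids, skip_special_tokens=True)[0]
        return [{"generated_text": text.strip()}]
```

A client would then send JSON shaped like {"inputs": {"image": "<base64>", "text": "What is in this picture?"}}; loading the weights in half precision may be worth trying if the model does not fit on the chosen GPU.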
