Intel/neural-chat-7b-v3-1

Text Generation

Eval Results (legacy)

text-generation-inference


Instructions for using Intel/neural-chat-7b-v3-1 with libraries, inference servers, and local apps.

Libraries

How to use Intel/neural-chat-7b-v3-1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Intel/neural-chat-7b-v3-1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Intel/neural-chat-7b-v3-1")
model = AutoModelForCausalLM.from_pretrained("Intel/neural-chat-7b-v3-1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

How to use Intel/neural-chat-7b-v3-1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Intel/neural-chat-7b-v3-1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Intel/neural-chat-7b-v3-1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
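
The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the previous step is listening on localhost:8000, and the helper names `build_chat_request` and `send_chat_request` are hypothetical, introduced here for illustration only.

```python
import json
import urllib.request

# Assumed values: they mirror the curl example (vLLM default port 8000).
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "Intel/neural-chat-7b-v3-1"


def build_chat_request(prompt, url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request carrying one user message."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server, so it is left commented out):
# req = build_chat_request("What is the capital of France?")
# print(send_chat_request(req))
```

Building the request separately from sending it makes the payload easy to inspect before any network traffic happens. Passing a different `url` (for example port 30000) should let the same helper target another OpenAI-compatible server.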

Use Docker

docker model run hf.co/Intel/neural-chat-7b-v3-1

How to use Intel/neural-chat-7b-v3-1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Intel/neural-chat-7b-v3-1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Intel/neural-chat-7b-v3-1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Intel/neural-chat-7b-v3-1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Intel/neural-chat-7b-v3-1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
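
Because these servers are described as OpenAI-compatible, the JSON they return should follow the chat-completions layout, with the reply under `choices[0].message.content`. The sketch below shows one way to pull the assistant text out of such a response. The function name `extract_reply` is hypothetical, and the sample dictionary is hand-written for illustration; it is not real output from the model.

```python
def extract_reply(response):
    """Return the assistant text from a chat-completions style response dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hypothetical response, hand-written for illustration (not real model output).
sample_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "<model reply here>"},
            "finish_reason": "stop",
        }
    ]
}

print(extract_reply(sample_response))
```

Raising an explicit error on an empty `choices` list is a design choice made here so that a malformed or error response fails loudly instead of surfacing as a confusing `IndexError`.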

Docker Model Runner
How to use Intel/neural-chat-7b-v3-1 with Docker Model Runner:
```
docker model run hf.co/Intel/neural-chat-7b-v3-1
```

neural-chat-7b-v3-1 / README.md

Commit History

Update ReadMe.md (#21)

c0d379a
verified

Chesebrough committed on Apr 1, 2024

Update README.md

e852bc2
verified

bconsolvo committed on Feb 22, 2024

Update README.md

56bb9e3
verified

lvkaokao committed on Feb 20, 2024

add bf16 inference

2c7e06a
verified

lvkaokao committed on Feb 20, 2024

Update README.md

bb4b857
verified

Haihao committed on Feb 20, 2024

Update README.md (#19)

1283d23
verified

bconsolvo committed on Jan 12, 2024

update README.md (#18)

37a4384
verified

bconsolvo committed on Jan 12, 2024

Update README.md

c70aa42

lvkaokao committed on Nov 28, 2023

update example.

af2489c

lvkaokao committed on Nov 24, 2023

Update README.md

478ba3c

Haihao committed on Nov 22, 2023

Adding Evaluation Results (#8)

6007258

leaderboard-pr-bot committed on Nov 21, 2023

Added demo code according to the prompt format (#5)

8394403

macadeliccc committed on Nov 20, 2023

add prompt template.

3995e9a

lvkaokao committed on Nov 17, 2023

Merge branch 'main' of https://huggingface.co/Intel/neural-chat-7b-v3-1 into main

b0025ee

lvkaokao committed on Nov 17, 2023

add prompt template.

f1d0c6f

lvkaokao committed on Nov 17, 2023

update hyper-parameters

fbbb8d6

lvkaokao committed on Nov 17, 2023

Update README.md

ab487c1

Haihao committed on Nov 16, 2023

Update README.md

9770488

Haihao committed on Nov 16, 2023

update blog.

292ea08

lvkaokao committed on Nov 16, 2023

fix typo.

8204f85

lvkaokao committed on Nov 15, 2023

fix typo.

9b5f27c

lvkaokao committed on Nov 15, 2023

update train args.

0ff1545

lvkaokao committed on Nov 14, 2023

update metric from llm leaderboard.

342b16d

lvkaokao committed on Nov 14, 2023

update metric from llm leaderboard.

fd81216

lvkaokao committed on Nov 14, 2023

update dpo a100

14f9323

lvkaokao committed on Nov 14, 2023

initial commit

b300ef4

lvkaokao committed on Nov 14, 2023