Instructions to use abideen/Phi-3-mini-4K-instruct-cpo-simpo with libraries and local apps.

Libraries

How to use abideen/Phi-3-mini-4K-instruct-cpo-simpo with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="abideen/Phi-3-mini-4K-instruct-cpo-simpo", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
# The pipeline returns a list of dicts; print it to see the generated reply
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("abideen/Phi-3-mini-4K-instruct-cpo-simpo", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("abideen/Phi-3-mini-4K-instruct-cpo-simpo", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use abideen/Phi-3-mini-4K-instruct-cpo-simpo with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "abideen/Phi-3-mini-4K-instruct-cpo-simpo"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "abideen/Phi-3-mini-4K-instruct-cpo-simpo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
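
The same request can be sent from Python instead of curl. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM, and the response layout (`choices[0].message.content`) is assumed to follow the OpenAI chat completions format that the server advertises.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the first reply's text (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the reply sits under choices[0].message.content,
    # as in the OpenAI chat completions response format.
    return body["choices"][0]["message"]["content"]


# Example usage, once the server from the commands above is running:
# request = build_chat_request(
#     "http://localhost:8000",
#     "abideen/Phi-3-mini-4K-instruct-cpo-simpo",
#     "What is the capital of France?",
# )
# print(send_chat_request(request))
```

For the SGLang server described in the next section, the same sketch should apply with the base URL changed to `http://localhost:30000`.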

Use Docker

docker model run hf.co/abideen/Phi-3-mini-4K-instruct-cpo-simpo

SGLang

How to use abideen/Phi-3-mini-4K-instruct-cpo-simpo with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "abideen/Phi-3-mini-4K-instruct-cpo-simpo" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "abideen/Phi-3-mini-4K-instruct-cpo-simpo",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "abideen/Phi-3-mini-4K-instruct-cpo-simpo" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use abideen/Phi-3-mini-4K-instruct-cpo-simpo with Docker Model Runner:
```
docker model run hf.co/abideen/Phi-3-mini-4K-instruct-cpo-simpo
```

Phi-3-mini-4K-instruct with CPO-SimPO

This repository contains the Phi-3-mini-4K-instruct model fine-tuned with the CPO-SimPO technique. CPO-SimPO combines Contrastive Preference Optimization (CPO) and Simple Preference Optimization (SimPO).

Introduction

Phi-3-mini-4K-instruct is a model optimized for instruction-following tasks. Fine-tuning it with CPO-SimPO improved its scores on several benchmarks, as summarized under Key Improvements.

What is CPO-SimPO?

CPO-SimPO is a technique that combines elements of CPO and SimPO:

Contrastive Preference Optimization (CPO): Adds a behavior cloning regularizer to ensure the model remains close to the preferred data distribution.
Simple Preference Optimization (SimPO): Incorporates length normalization and target reward margins to prevent the generation of long but low-quality sequences.
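
The two ingredients above can be sketched as a single loss for one preference pair. This is an illustration under stated assumptions, not the training code used for this model: the card does not give the formula, so the form below (a SimPO-style sigmoid loss on length-normalized log-probabilities with a target margin, plus a weighted negative log-likelihood term on the preferred response) follows the usual descriptions of the two methods. The function name, the per-token normalization of the behavior cloning term, and the default values of `beta`, `gamma`, and `alpha` are all hypothetical.

```python
import math


def cpo_simpo_loss(chosen_logp, rejected_logp, chosen_len, rejected_len,
                   beta=2.0, gamma=1.0, alpha=1.0):
    """Scalar sketch of a CPO-SimPO style loss for one preference pair.

    chosen_logp / rejected_logp: summed token log-probabilities of the
    preferred and rejected responses under the model being trained.
    beta, gamma, alpha: illustrative values, not this model's settings.
    """
    # SimPO part: length-normalized log-probabilities act as implicit rewards,
    # so a long response gains nothing from its length alone.
    chosen_reward = beta * chosen_logp / chosen_len
    rejected_reward = beta * rejected_logp / rejected_len
    # The preferred response must beat the rejected one by at least gamma.
    margin = chosen_reward - rejected_reward - gamma
    # -log(sigmoid(margin)), computed as a numerically stable softplus(-margin).
    preference_loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    # CPO part: behavior cloning term, here the per-token negative
    # log-likelihood of the preferred response (an assumption).
    bc_loss = -chosen_logp / chosen_len
    return preference_loss + alpha * bc_loss


# The loss is lower when the model already prefers the chosen response.
good = cpo_simpo_loss(chosen_logp=-10.0, rejected_logp=-30.0, chosen_len=10, rejected_len=10)
bad = cpo_simpo_loss(chosen_logp=-30.0, rejected_logp=-10.0, chosen_len=10, rejected_len=10)
print(good < bad)
```

In a real training loop the same computation would run on batched tensors with gradients; the scalar version only shows how the length normalization, the margin, and the regularizer fit together.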

Github

CPO-SIMPO

Model Performance

COMING SOON!

Key Improvements:

Enhanced Model Performance: Significant score improvements, particularly in GSM8K (up by 8.49 points!) and TruthfulQA (up by 2.07 points).
Quality Control: Improved generation of high-quality sequences through length normalization and reward margins.
Balanced Optimization: The BC regularizer helps maintain the integrity of learned preferences without deviating from the preferred data distribution.

Usage

Installation

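To use this model, install the transformers library from Hugging Face, along with PyTorch and accelerate (the latter is needed for the device_map argument used in the inference example).

pip install transformers torch accelerate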

Inference

Here's an example of how to perform inference with the model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

torch.random.manual_seed(0)

# Load the weights onto the GPU; device_map requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(
    "abideen/Phi-3-mini-4K-instruct-cpo-simpo",
    device_map="cuda",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("abideen/Phi-3-mini-4K-instruct-cpo-simpo")

messages = [
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving a 2x + 3 = 7 equation?"},
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,  # return only the newly generated reply
    "do_sample": False,  # greedy decoding; temperature has no effect when sampling is off
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])

Safetensors

Model size: 4B params
Tensor type: BF16
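
The model size and tensor type listed above give a rough idea of how much memory the weights alone occupy. The following is a back-of-the-envelope sketch: it assumes the rounded 4B parameter count shown on the page and 2 bytes per BF16 value, and it ignores activations, the KV cache, and framework overhead, so actual usage will be higher. The helper name `weight_memory_gib` is illustrative.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# 4B parameters (rounded figure from the model page), BF16 = 2 bytes per parameter
print(f"{weight_memory_gib(4e9, 2):.2f} GiB")  # prints 7.45 GiB
```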