Instructions for using jpacifico/Chocolatine-14B-Instruct-DPO-v1.2 with libraries and local apps.

Libraries

How to use jpacifico/Chocolatine-14B-Instruct-DPO-v1.2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="jpacifico/Chocolatine-14B-Instruct-DPO-v1.2", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
# The pipeline applies the model's chat template to the message list; print the result.
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("jpacifico/Chocolatine-14B-Instruct-DPO-v1.2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("jpacifico/Chocolatine-14B-Instruct-DPO-v1.2", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
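# Render the conversation with the model's chat template and tokenize it.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt tokens at the start.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))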

Local Apps

vLLM

How to use jpacifico/Chocolatine-14B-Instruct-DPO-v1.2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
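
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on localhost:8000 and returns the usual OpenAI-style chat completion body; the helper names `build_chat_request` and `chat` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_message: str) -> urllib.request.Request:
    """Build the same POST request as the curl command above (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, model: str, user_message: str, timeout: float = 60.0) -> str:
    """Send one user message and return the assistant's reply text.

    Assumes an OpenAI-style response, where the reply is at
    choices[0].message.content.
    """
    request = build_chat_request(base_url, model, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("http://localhost:8000", "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2", "What is the capital of France?")` should return the reply as a string. Because the SGLang endpoint is also described as OpenAI-compatible, the same helper should work against `http://localhost:30000`.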

Use Docker

docker model run hf.co/jpacifico/Chocolatine-14B-Instruct-DPO-v1.2

SGLang

How to use jpacifico/Chocolatine-14B-Instruct-DPO-v1.2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use jpacifico/Chocolatine-14B-Instruct-DPO-v1.2 with Docker Model Runner:
```
docker model run hf.co/jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
```

jpacifico committed on Feb 3, 2025

Commit 8394465 (verified)

1 Parent(s): 3a83006

Update README.md

Files changed (1)

README.md +24 -18

@@ -147,30 +147,34 @@ Chocolatine-14B-Instruct-DPO-v1.2 outperforms its previous versions and its base
 ########## First turn ##########
                                              score
 model                                 turn
-gpt-4o-mini                           1     9.2875
-Chocolatine-2-14B-Instruct-v2.0.1     1     8.9125
-Chocolatine-14B-Instruct-4k-DPO       1     8.6375
-Chocolatine-14B-Instruct-DPO-v1.2     1     8.6125
-Phi-3.5-mini-instruct                 1     8.5250
-Chocolatine-3B-Instruct-DPO-v1.2      1     8.3750
-phi-4                                 1     8.3000
-Phi-3-medium-4k-instruct              1     8.2250
-gpt-3.5-turbo                         1     8.1375
-Chocolatine-3B-Instruct-DPO-Revised   1     7.9875
-Daredevil-8B                          1     7.8875
-Meta-Llama-3.1-8B-Instruct            1     7.0500
-vigostral-7b-chat                     1     6.7875
-Mistral-7B-Instruct-v0.3              1     6.7500
-gemma-2-2b-it                         1     6.4500
-French-Alpaca-7B-Instruct_beta        1     5.6875
-vigogne-2-7b-chat                     1     5.6625
 ########## Second turn ##########
                                                score
 model                                 turn
 Chocolatine-2-14B-Instruct-v2.0.1     2     9.275000
 gpt-4o-mini                           2     8.912500
 Chocolatine-14B-Instruct-DPO-v1.2     2     8.337500
 phi-4                                 2     8.131250
 Chocolatine-3B-Instruct-DPO-Revised   2     7.937500
 Chocolatine-3B-Instruct-DPO-v1.2      2     7.862500
@@ -191,7 +195,9 @@ vigogne-2-7b-chat                     2     2.775000
 model
 gpt-4o-mini                            9.100000
 Chocolatine-2-14B-Instruct-v2.0.1      9.093750
 Chocolatine-14B-Instruct-DPO-v1.2      8.475000
 phi-4                                  8.215625
 Chocolatine-14B-Instruct-4k-DPO        8.187500
 Chocolatine-3B-Instruct-DPO-v1.2       8.118750
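
The overall scores in the last table appear to be the unweighted mean of each model's first-turn and second-turn scores. This is an inference from the numbers shown, not something the README states. A minimal Python sketch that checks it for four models, with values copied from the tables above (the function name `overall_score` is hypothetical):

```python
import math

# (first-turn score, second-turn score, reported overall score),
# copied from the tables above.
REPORTED = {
    "gpt-4o-mini": (9.2875, 8.912500, 9.100000),
    "Chocolatine-2-14B-Instruct-v2.0.1": (8.9125, 9.275000, 9.093750),
    "Chocolatine-14B-Instruct-DPO-v1.2": (8.6125, 8.337500, 8.475000),
    "phi-4": (8.3000, 8.131250, 8.215625),
}


def overall_score(first_turn: float, second_turn: float) -> float:
    """Assumed aggregation: unweighted mean of the two turn scores."""
    return (first_turn + second_turn) / 2


for model, (first, second, reported) in REPORTED.items():
    computed = overall_score(first, second)
    # Compare with a small tolerance to avoid floating-point noise.
    status = "matches" if math.isclose(computed, reported, abs_tol=1e-6) else "differs"
    print(f"{model}: computed {computed:.6f}, reported {reported:.6f} ({status})")
```
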
@@ -212,7 +218,7 @@ vigogne-2-7b-chat                      4.218750
 You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb)
-You can also run Chocolatine using the following code:
 ```python
 import transformers

 ########## First turn ##########
                                              score
 model                                 turn
+gpt-4o-mini                           1     9.287500
+Chocolatine-2-14B-Instruct-v2.0.1     1     8.912500
+Qwen2.5-14B-Instruct                  1     8.887500
+Chocolatine-14B-Instruct-4k-DPO       1     8.637500
+Chocolatine-14B-Instruct-DPO-v1.2     1     8.612500
+Phi-3.5-mini-instruct                 1     8.525000
+Chocolatine-3B-Instruct-DPO-v1.2      1     8.375000
+DeepSeek-R1-Distill-Qwen-14B          1     8.375000
+phi-4                                 1     8.300000
+Phi-3-medium-4k-instruct              1     8.225000
+gpt-3.5-turbo                         1     8.137500
+Chocolatine-3B-Instruct-DPO-Revised   1     7.987500
+Daredevil-8B                          1     7.887500
+Meta-Llama-3.1-8B-Instruct            1     7.050000
+vigostral-7b-chat                     1     6.787500
+Mistral-7B-Instruct-v0.3              1     6.750000
+gemma-2-2b-it                         1     6.450000
+French-Alpaca-7B-Instruct_beta        1     5.687500
+vigogne-2-7b-chat                     1     5.662500
 ########## Second turn ##########
                                                score
 model                                 turn
 Chocolatine-2-14B-Instruct-v2.0.1     2     9.275000
 gpt-4o-mini                           2     8.912500
+Qwen2.5-14B-Instruct                  2     8.912500
 Chocolatine-14B-Instruct-DPO-v1.2     2     8.337500
+DeepSeek-R1-Distill-Qwen-14B          2     8.200000
 phi-4                                 2     8.131250
 Chocolatine-3B-Instruct-DPO-Revised   2     7.937500
 Chocolatine-3B-Instruct-DPO-v1.2      2     7.862500
 model
 gpt-4o-mini                            9.100000
 Chocolatine-2-14B-Instruct-v2.0.1      9.093750
+Qwen2.5-14B-Instruct                   8.900000
 Chocolatine-14B-Instruct-DPO-v1.2      8.475000
+DeepSeek-R1-Distill-Qwen-14B           8.287500
 phi-4                                  8.215625
 Chocolatine-14B-Instruct-4k-DPO        8.187500
 Chocolatine-3B-Instruct-DPO-v1.2       8.118750
 You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb)
+You can also run Chocolatine-2 using the following code:
 ```python
 import transformers