Safetensors
gemma3
compressed-tensors
bnjmnmarie committed on
Commit
97c1867
·
verified ·
1 Parent(s): d0114c6

Create README.md

  1. README.md +36 -0
README.md ADDED
@@ -0,0 +1,36 @@
1
+ ---
2
+ license: gemma
3
+ datasets:
4
+ - kaitchup/opus100-translategemma-calib
5
+ base_model:
6
+ - google/translategemma-27b-it
7
+ ---
8
+
9
+
10
+ This is a quantized variant of **google/translategemma-27b-it**, created by **The Kaitchup** (newsletter: https://kaitchup.substack.com).
11
+
12
+ More details (training recipe, benchmarks, and recommended settings) will be added later. In the meantime, here are the current notes and a working inference example.
13
+
14
+ ## Status / limitations
15
+
16
+ - **Quick smoke test only** (not fully evaluated).
17
+ - **RoPE parameters were removed** for compatibility with **vLLM**. As a result, **long-context behavior may be degraded**. I have not verified the impact yet.
18
+ - **Chat template not supported (for now).** To use the model in vLLM, call the **completions** endpoint (`/v1/completions`) rather than the chat completions endpoint, and provide a **fully formatted prompt**.
19
+
20
+ ## Serving with vLLM
21
+
22
+
23
+ ```
24
+ vllm serve kaitchup/translategemma-27b-it-FP8-Dynamic --max-model-len 2048 --chat-template-content-format openai --served-model-name gemma
25
+ ```
26
+
27
+ ```
28
+ curl -s http://localhost:8000/v1/completions -H "Content-Type: application/json" -d '{
29
+ "model": "gemma",
30
+ "prompt": "<bos><start_of_turn>user\nYou are a professional French (fr) to English (en) translator. Your goal is to accurately convey the meaning and nuances of the original French text while adhering to English grammar, vocabulary, and cultural sensitivities.\nProduce only the English translation, without any additional explanations or commentary. Please translate the following French text into English:\n\n\nJ’aime les pâtes !<end_of_turn>\n<start_of_turn>model\n",
31
+ "temperature": 0,
32
+ "max_tokens": 200,
33
+ "stop": ["<end_of_turn>"]
34
+ }'
35
+
36
+ ```
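
Because the chat template is not applied by the server, the prompt has to be assembled by hand for every request. The following is a minimal Python sketch of that step, assuming the server from the `vllm serve` command above is listening on `http://localhost:8000` under the served name `gemma`. The helper names (`build_prompt`, `build_payload`, `translate`) and the idea of parameterizing the language names and codes are illustrative assumptions, not part of the model or of vLLM; the template text itself is copied from the curl example, and the response is read assuming the usual OpenAI-style completions shape (`choices[0].text`).

```python
import json
import urllib.request

# Prompt layout copied from the curl example; only the language names,
# language codes, and the text to translate are substituted.
PROMPT_TEMPLATE = (
    "<bos><start_of_turn>user\n"
    "You are a professional {src} ({src_code}) to {tgt} ({tgt_code}) translator. "
    "Your goal is to accurately convey the meaning and nuances of the original "
    "{src} text while adhering to {tgt} grammar, vocabulary, and cultural "
    "sensitivities.\n"
    "Produce only the {tgt} translation, without any additional explanations or "
    "commentary. Please translate the following {src} text into {tgt}:\n\n\n"
    "{text}<end_of_turn>\n"
    "<start_of_turn>model\n"
)


def build_prompt(text, src="French", src_code="fr", tgt="English", tgt_code="en"):
    """Return the fully formatted prompt (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(
        text=text, src=src, src_code=src_code, tgt=tgt, tgt_code=tgt_code
    )


def build_payload(text, model="gemma", **langs):
    """Build the JSON body with the same sampling settings as the curl example."""
    return {
        "model": model,
        "prompt": build_prompt(text, **langs),
        "temperature": 0,
        "max_tokens": 200,
        "stop": ["<end_of_turn>"],
    }


def translate(text, url="http://localhost:8000/v1/completions", **langs):
    """POST the payload to the completions endpoint and return the generated text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text, **langs)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes an OpenAI-style completions response.
    return body["choices"][0]["text"].strip()
```

With the server running, `translate("Bonjour tout le monde !")` sends the same kind of request as the curl command. Other language pairs can be tried by passing `src`, `src_code`, `tgt`, and `tgt_code`, although only the French-to-English prompt shown above has been smoke-tested here.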