hostkimjang commited on
Commit
0d197da
·
verified ·
1 Parent(s): 2e0bd3b

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +19 -0
README.md CHANGED
@@ -109,6 +109,25 @@ Known limitations:
109
  - side effects may exist (tone shift, verbosity changes, occasional riskier completions)
110
  - evaluation is not exhaustive; additional red-teaming is recommended
111
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
112
  ---
113
 
114
  ## How to Use
 
109
  - side effects may exist (tone shift, verbosity changes, occasional riskier completions)
110
  - evaluation is not exhaustive; additional red-teaming is recommended
111
 
112
+ ---
113
+ ## GGUF (llama.cpp) Inference
114
+
115
+ This repository also provides an **F16 GGUF** build under `gguf/`, intended for running with **llama.cpp**.
116
+
117
+ ### Run with `llama-server` (Thinking ON)
118
+
119
+ > This command enables the model's "thinking" behavior via `--chat-template-kwargs`.
120
+
121
+ #### Linux / macOS
122
+
123
+ ```bash
124
+ ./llama-server \
125
+ -m {PATH}/HyperCLOVAX-SEED-Think-32B-heretic2.f16.gguf \
126
+ --host 0.0.0.0 --port 10000 \
127
+ --jinja \
128
+ --chat-template-kwargs '{"thinking":true,"enable_thinking":true}' \
129
+ -cb -fa on
130
+
131
  ---
132
 
133
  ## How to Use