Eclipse-Senpai committed on
Commit c6596b9 · verified · 1 Parent(s): 2421de5

Add F16 GGUF and model card

Files changed (3)
  1. .gitattributes +1 -0
  2. KeyLM-75M-Instruct.F16.gguf +3 -0
  3. README.md +52 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ KeyLM-75M-Instruct.F16.gguf filter=lfs diff=lfs merge=lfs -text
KeyLM-75M-Instruct.F16.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c258bc317340147eb13f7455900560f958ed6f1f723fa58c90090cbad443a56
+ size 150979840
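Reviewer note: the three added lines are a Git LFS pointer, not the model weights themselves. A downloaded copy of the GGUF can be checked against the pointer's `oid` and `size`. The following is a minimal sketch, assuming the file has already been fetched locally; the helper names (`parse_lfs_pointer`, `sha256_of_file`, `matches_pointer`) are hypothetical and are not part of this commit or of any library.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:4c258bc317340147eb13f7455900560f958ed6f1f723fa58c90090cbad443a56
size 150979840
"""

EXPECTED_OID = "4c258bc317340147eb13f7455900560f958ed6f1f723fa58c90090cbad443a56"
EXPECTED_SIZE = 150979840  # bytes; roughly 144 MiB (150979840 / 2**20)


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    return fields


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large models are never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, pointer):
    """Return True if the file's size and SHA-256 both match the pointer."""
    algo, _, oid = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algo!r}")
    file_path = Path(path)
    # Compare the cheap size check first; hash only if the size agrees.
    if file_path.stat().st_size != int(pointer["size"]):
        return False
    return sha256_of_file(file_path) == oid
```

With the file in the working directory, `matches_pointer("KeyLM-75M-Instruct.F16.gguf", parse_lfs_pointer(POINTER_TEXT))` should return `True` for an intact download. The size check also explains the README's "~144 MB" figure, which corresponds to mebibytes rather than decimal megabytes.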
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ base_model: Eclipse-Senpai/KeyLM-75M-Instruct
+ base_model_relation: quantized
+ pipeline_tag: text-generation
+ library_name: gguf
+ tags:
+ - keylm
+ - gguf
+ - llama.cpp
+ - small-language-model
+ - instruct
+ ---
+
+ # KeyLM-75M-Instruct-GGUF
+
+ GGUF builds of [**KeyLM-75M-Instruct**](https://huggingface.co/Eclipse-Senpai/KeyLM-75M-Instruct) for `llama.cpp`, LM Studio, Ollama, and other GGUF runtimes.
+
+ KeyLM is a 75M-parameter instruction-tuned language model trained from scratch on approximately 18 billion tokens. See the [main model card](https://huggingface.co/Eclipse-Senpai/KeyLM-75M-Instruct) for benchmarks, training details, limitations, and the `transformers` (safetensors) version.
+
+ ## Files
+
+ | File | Quant | Size | Notes |
+ |---|---|---|---|
+ | `KeyLM-75M-Instruct.F16.gguf` | F16 | ~144 MB | Full precision and recommended. The model is already tiny, so there is little reason to quantize further. |
+
+ ## Run with llama.cpp
+
+ ```bash
+ # straight from the Hub
+ llama-cli -hf Eclipse-Senpai/KeyLM-75M-Instruct-GGUF -cnv
+
+ # or a local file
+ llama-cli -m KeyLM-75M-Instruct.F16.gguf -cnv
+ ```
+
+ The chat template (`User:` / `Assistant:`, assistant turns ending with `</s>`) is embedded in the GGUF, so conversation mode (`-cnv`) applies it automatically.
+
+ ## LM Studio / Ollama
+
+ - **LM Studio:** load the `.gguf`; the embedded chat template is detected automatically.
+ - **Ollama:** `ollama run hf.co/Eclipse-Senpai/KeyLM-75M-Instruct-GGUF`
+
+ ## Notes & limitations
+
+ KeyLM is a tiny model: good at simple instruction following and short chat, near random chance on knowledge/reasoning benchmarks. It is not a factual assistant. Full numbers and caveats are on the [main model card](https://huggingface.co/Eclipse-Senpai/KeyLM-75M-Instruct).
+
+ ## License
+
+ Apache 2.0.
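Reviewer note: the README describes the embedded chat template only loosely (`User:` / `Assistant:` turns, with assistant turns ending in `</s>`). For anyone who needs to build a raw prompt outside conversation mode, the sketch below shows one plausible rendering. The exact whitespace, the newline separator, and the trailing `Assistant:` generation prompt are assumptions, not taken from the GGUF; the authoritative template is the one stored in the model file. `format_chat` is a hypothetical helper.

```python
EOS = "</s>"  # end-of-turn marker named in the README


def format_chat(turns, add_generation_prompt=True):
    """Render (role, text) pairs as a User:/Assistant: transcript.

    Assumed layout: one turn per line, a single space after the role label,
    and EOS appended to assistant turns only.
    """
    lines = []
    for role, text in turns:
        if role == "user":
            lines.append(f"User: {text}")
        elif role == "assistant":
            lines.append(f"Assistant: {text}{EOS}")
        else:
            raise ValueError(f"unknown role: {role!r}")
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        lines.append("Assistant:")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_chat([("user", "Name three colors.")]))
```

Comparing this output with what `llama-cli` prints in verbose mode is a quick way to confirm or correct the assumed layout before relying on it.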