Alittlehammmer committed
Commit dd17fe9 · verified · 1 Parent(s): 2022ad9

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -25,18 +25,18 @@ Converted to BF16 using `convert_hf_to_gguf.py`, then quantized using `llama-qua
 
  | Quant | Bits | Size | Notes |
  | ------ | ----- | ------- | -------------------------------- |
- | Q6_K | 6 | ~303 GB | Near-lossless, very high quality |
+ | Q6_K | 6 | ~326 GB | Near-lossless, very high quality |
 
  ## Usage
 
- Download the shards (split into 7 parts) and run with llama.cpp. The split files are
+ Download the shards and run with llama.cpp. The split files are
  loaded automatically when you point at the first shard:
 
  ```bash
  llama-cli -m Ornith-1.0-397B-Q6_K/Ornith-1.0-397B-Q6_K-00001-of-00007.gguf -p "Hello"
  ```
 
- llama.cpp handles the `-00001-of-00007` split automatically, so you do not need to merge
+ llama.cpp handles the `-00001-of-0000X` split automatically, so you do not need to merge
  the shards manually.
 
  ## Original model
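
Because the README now leaves the shard count to the file name, it may help to confirm that every shard is on disk before launching llama.cpp. The following is a minimal sketch, assuming the `-NNNNN-of-NNNNN.gguf` suffix shown in the usage example. The helper names `expected_shards` and `missing_shards` are hypothetical and are not part of llama.cpp.

```bash
#!/usr/bin/env bash
# Sketch only: derive the full shard list from a first-shard file name.
# expected_shards and missing_shards are hypothetical helpers, not llama.cpp tools.

expected_shards() {
  local first="$1"
  # Assumed naming scheme: <prefix>-NNNNN-of-NNNNN.gguf
  local re='^(.*)-([0-9]{5})-of-([0-9]{5})\.gguf$'
  if [[ ! "$first" =~ $re ]]; then
    echo "not a split GGUF name: $first" >&2
    return 1
  fi
  local prefix="${BASH_REMATCH[1]}"
  local total_str="${BASH_REMATCH[3]}"
  # Force base 10 so the leading zeros are not parsed as octal.
  local total=$((10#$total_str))
  local i
  for ((i = 1; i <= total; i++)); do
    printf '%s-%05d-of-%s.gguf\n' "$prefix" "$i" "$total_str"
  done
}

missing_shards() {
  # Print each expected shard path that does not exist as a regular file.
  local path
  while IFS= read -r path; do
    [[ -f "$path" ]] || echo "$path"
  done < <(expected_shards "$1")
}
```

Running `missing_shards Ornith-1.0-397B-Q6_K/Ornith-1.0-397B-Q6_K-00001-of-00007.gguf` prints nothing when all shards named by that suffix are present, and otherwise lists the paths that still need to be downloaded.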