Matrix committed on
Upload README.md with huggingface_hub
README.md
CHANGED
@@ -28,7 +28,7 @@ library_name: transformers
 license: cc-by-nc-4.0
 tags:
 - llama-cpp
--
+- matrixportal
 inference: false
 extra_gated_prompt: By submitting this form, you agree to the [License Agreement](https://cohere.com/c4ai-cc-by-nc-license) and
 acknowledge that the information you provide will be collected, used, and shared
@@ -292,66 +292,27 @@ extra_gated_fields:
 I agree to use this model for non-commercial use ONLY: checkbox
 ---

-## Quant List
-
-You can download the desired quant version from the list here.
-| Link | Type | Size/GB | Notes |
-|:-----|:-----|--------:|:------|
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q2_k.gguf) | Q2_K | 3.44 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_s.gguf) | Q3_K_S | 3.87 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | Q3_K_M | 4.22 | lower quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_l.gguf) | Q3_K_L | 4.53 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_0.gguf) | Q4_0 | 4.80 | Arm, fast |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_s.gguf) | Q4_K_S | 4.83 | fast, recommended |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) | Q4_K_M | 5.06 | fast, recommended |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_0.gguf) | Q5_0 | 5.67 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_s.gguf) | Q5_K_S | 5.67 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_m.gguf) | Q5_K_M | 5.8 | |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q6_k.gguf) | Q6_K | 6.60 | very good quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q8_0.gguf) | Q8_0 | 8.54 | fast, best quality |
-| [Download here](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-f16.gguf) | f16 | 16.07 | 16 bpw, overkill |
-
-
 # matrixportal/aya-23-8B-GGUF
-This model was converted to GGUF format from [`CohereForAI/aya-23-8B`](https://huggingface.co/CohereForAI/aya-23-8B) using llama.cpp via the ggml.ai's [
+This model was converted to GGUF format from [`CohereForAI/aya-23-8B`](https://huggingface.co/CohereForAI/aya-23-8B) using llama.cpp via the ggml.ai's [all-gguf-same-where](https://huggingface.co/spaces/matrixportal/all-gguf-same-where) space.
 Refer to the [original model card](https://huggingface.co/CohereForAI/aya-23-8B) for more details on the model.

-##
-
-
-```bash
-brew install llama.cpp
-
-```
-Invoke the llama.cpp server or the CLI.
-
-### CLI:
-```bash
-llama-cli --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -p "The meaning to life and the universe is"
-```
-
-### Server:
-```bash
-llama-server --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -c 2048
-```
-
-Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.
-
-Step 1: Clone llama.cpp from GitHub.
-```
-git clone https://github.com/ggerganov/llama.cpp
-```
+## ✅ Quantized Models Download List
+**✨ Recommended for CPU:** `Q4_K_M` | **⚡ Recommended for ARM CPU:** `Q4_0` | **🏆 Best Quality:** `Q8_0`

-
-
-
-
+| 🚀 Download | 🔢 Type | 📝 Notes |
+|:---------|:-----|:------|
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q2_k.gguf) |  | Basic quantization |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_s.gguf) |  | Small size |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_m.gguf) |  | Balanced quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q3_k_l.gguf) |  | Better quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_0.gguf) |  | Fast on ARM |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_s.gguf) |  | Fast, recommended |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q4_k_m.gguf) |  ⭐ | Best balance |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_0.gguf) |  | Good quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_s.gguf) |  | Balanced |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q5_k_m.gguf) |  | High quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q6_k.gguf) |  🏆 | Very good quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-q8_0.gguf) |  ⚡ | Fast, best quality |
+| [Download](https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main/aya-23-8b-f16.gguf) |  | Maximum accuracy |

-
-```
-./llama-cli --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -p "The meaning to life and the universe is"
-```
-or
-```
-./llama-server --hf-repo matrixportal/aya-23-8B-GGUF --hf-file aya-23-8b-q8_0.gguf -c 2048
-```
+💡 **Tip:** Use `F16` for maximum precision when quality is critical
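The download links in the new table all follow one pattern: the repository's `resolve/main` URL followed by `aya-23-8b-<type>.gguf`, with the quantization type in lowercase. The sketch below derives a link from a type name and looks up the recommendations from the README's summary line. It is a minimal illustration, not part of the commit: `quant_url`, `QUANT_TYPES`, and `RECOMMENDED` are hypothetical names, and it assumes the file naming stays as shown in the table.

```python
# Hypothetical helpers illustrating the URL pattern in the download table.
# None of these names come from the README or from any library.
BASE_URL = "https://huggingface.co/matrixportal/aya-23-8B-GGUF/resolve/main"

# Quantization types listed in the new README table.
QUANT_TYPES = (
    "Q2_K", "Q3_K_S", "Q3_K_M", "Q3_K_L", "Q4_0", "Q4_K_S", "Q4_K_M",
    "Q5_0", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0", "F16",
)

# Recommendations copied from the README's summary line.
RECOMMENDED = {"cpu": "Q4_K_M", "arm_cpu": "Q4_0", "best_quality": "Q8_0"}


def quant_url(quant: str) -> str:
    """Return the download URL for one of the listed quantization types."""
    name = quant.upper()
    if name not in QUANT_TYPES:
        raise ValueError(f"unknown quantization type: {quant!r}")
    # File names in the table use the lowercase type, e.g. aya-23-8b-q4_k_m.gguf
    return f"{BASE_URL}/aya-23-8b-{name.lower()}.gguf"


if __name__ == "__main__":
    # Print the link for the quantization recommended for general CPU use.
    print(quant_url(RECOMMENDED["cpu"]))
```

Deriving the link from the type name keeps the two in step, which avoids the kind of mismatch seen in the removed table, where the `Q4_K_M` row pointed at the `q3_k_m` file.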