---
license: apache-2.0
base_model: Qwen/Qwen3.5-35B-A3B-Base
tags:
  - gguf
  - fine-tuned
  - tool-calling
  - mcp
  - dbt
  - q8_0
---

# ecu-pilot (GGUF Q8_0)

Q8_0-quantized GGUF build of [ecu-pilot-fp16](https://huggingface.co/mach-kernel/ecu-pilot-fp16), a fine-tune of Qwen3.5-35B-A3B for structured tool calling against project metadata via the Model Context Protocol (MCP).

## Quantization

| Property | Value |
|---|---|
| **Source** | [mach-kernel/ecu-pilot-fp16](https://huggingface.co/mach-kernel/ecu-pilot-fp16) |
| **Method** | Q8_0 via llama.cpp |
| **Size** | ~35 GB |
| **Architecture** | Mixture of Experts (35B total parameters, 3B active per token) |

## Usage

### Ollama

```bash
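# Write a Modelfile that points at the downloaded GGUF file
cat > Modelfile <<'EOF'
FROM ./ecu-pilot-q8_0.gguf
PARAMETER temperature 0.2
PARAMETER num_ctx 8192
PARAMETER stop "<|im_end|>"
EOF

# Register the model with Ollama, then start an interactive session
ollama create ecu-pilot -f Modelfile
ollama run ecu-pilot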
```

### llama.cpp

```bash
# -ngl 99 offloads all layers to the GPU; -cnv starts interactive conversation mode
llama-cli -m ecu-pilot-q8_0.gguf -ngl 99 -cnv
```
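
Since the model is tuned for tool calling, serving it over HTTP may be more useful than chatting in a terminal. The sketch below assumes llama.cpp's `llama-server` started with `--jinja` (so the chat template handles tool definitions) on port 8080. The `list_models` tool and its `project` parameter are made up for illustration; the tool schemas the model was actually trained on are not listed in this card, so substitute the definitions exposed by your own MCP server.

```bash
# Assumption: llama-server is already running in another terminal, e.g.
#   llama-server -m ecu-pilot-q8_0.gguf -ngl 99 --jinja --port 8080

# Write an OpenAI-style chat request that advertises one tool.
# "list_models" and its "project" parameter are hypothetical placeholders,
# not the tool schema the model was trained on.
cat > request.json <<'EOF'
{
  "messages": [
    {"role": "user", "content": "Which models does the analytics project define?"}
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "list_models",
        "description": "List the models defined in a project (hypothetical tool).",
        "parameters": {
          "type": "object",
          "properties": {
            "project": {"type": "string", "description": "Project name"}
          },
          "required": ["project"]
        }
      }
    }
  ],
  "temperature": 0.2
}
EOF

# POST the request to llama-server's OpenAI-compatible chat endpoint.
send_request() {
  curl -s http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d @request.json
}
```

Call `send_request` once the server is up. If the model chooses to use the tool, the reply should carry a `tool_calls` entry under `choices[0].message` instead of plain text; executing that call and feeding the result back is left to your MCP client.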

## All variants

| Format | Repository | Size |
|--------|-----------|------|
| FP16 | [mach-kernel/ecu-pilot-fp16](https://huggingface.co/mach-kernel/ecu-pilot-fp16) | ~67 GB |
| GGUF Q4_K_M | [mach-kernel/ecu-pilot-q4km](https://huggingface.co/mach-kernel/ecu-pilot-q4km) | ~20 GB |
| GGUF Q8_0 (this repo) | [mach-kernel/ecu-pilot-q8_0](https://huggingface.co/mach-kernel/ecu-pilot-q8_0) | ~35 GB |
| LoRA adapter | [mach-kernel/ecu-pilot-fp16-lora](https://huggingface.co/mach-kernel/ecu-pilot-fp16-lora) | ~4 GB |

## Why "ecu"

No reason. Just liked how it sounded. Definitely not a Caesar cipher of anything. Don't look into it.