Instructions to use mradermacher/vincent-i1-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use mradermacher/vincent-i1-GGUF with Transformers:

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("mradermacher/vincent-i1-GGUF", dtype="auto")

llama-cpp-python

How to use mradermacher/vincent-i1-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="mradermacher/vincent-i1-GGUF",
	filename="vincent.i1-IQ1_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use mradermacher/vincent-i1-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf mradermacher/vincent-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf mradermacher/vincent-i1-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf mradermacher/vincent-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf mradermacher/vincent-i1-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf mradermacher/vincent-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf mradermacher/vincent-i1-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf mradermacher/vincent-i1-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf mradermacher/vincent-i1-GGUF:Q4_K_M

Use Docker

docker model run hf.co/mradermacher/vincent-i1-GGUF:Q4_K_M

LM Studio
Jan
Ollama
How to use mradermacher/vincent-i1-GGUF with Ollama:
```
ollama run hf.co/mradermacher/vincent-i1-GGUF:Q4_K_M
```

Unsloth Studio

How to use mradermacher/vincent-i1-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mradermacher/vincent-i1-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mradermacher/vincent-i1-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for mradermacher/vincent-i1-GGUF to start chatting

How to use mradermacher/vincent-i1-GGUF with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf mradermacher/vincent-i1-GGUF:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "mradermacher/vincent-i1-GGUF:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use mradermacher/vincent-i1-GGUF with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf mradermacher/vincent-i1-GGUF:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default mradermacher/vincent-i1-GGUF:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use mradermacher/vincent-i1-GGUF with Docker Model Runner:
```
docker model run hf.co/mradermacher/vincent-i1-GGUF:Q4_K_M
```

Lemonade

How to use mradermacher/vincent-i1-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull mradermacher/vincent-i1-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.vincent-i1-GGUF-Q4_K_M

List all available models

lemonade list

vincent-i1-GGUF / README.md

mradermacher

auto-patch README.md

848f3a6 verified 4 months ago

preview code

raw

history blame contribute delete

6.09 kB

	---
	base_model: naturecodeproject/vincent
	extra_gated_fields:
	I agree to use this model responsibly: checkbox
	Intended Use: text
	Name: text
	Organization: text
	extra_gated_prompt: You are about to request access to Naturecode Vincent, a fine-tuned
	language model that embodies the persona of Vincent van Gogh. By requesting access,
	you agree to use this model responsibly for research, educational, or creative purposes
	only. Commercial use requires explicit permission from Naturecode.
	language:
	- en
	library_name: transformers
	license: other
	license_link: https://naturecode.xyz/license
	license_name: naturecode-research
	mradermacher:
	readme_rev: 1
	quantized_by: mradermacher
	tags:
	- conversational
	- art
	- history
	- vincent-van-gogh
	- persona
	- fine-tuned
	---
	## About

	<!-- ### quantize_version: 2 -->
	<!-- ### output_tensor_quantised: 1 -->
	<!-- ### convert_type: hf -->
	<!-- ### vocab_type: -->
	<!-- ### tags: nicoboss -->
	<!-- ### quants: Q2_K IQ3_M Q4_K_S IQ3_XXS Q3_K_M small-IQ4_NL Q4_K_M IQ2_M Q6_K IQ4_XS Q2_K_S IQ1_M Q3_K_S IQ2_XXS Q3_K_L IQ2_XS Q5_K_S IQ2_S IQ1_S Q5_K_M Q4_0 IQ3_XS Q4_1 IQ3_S -->
	<!-- ### quants_skip: -->
	<!-- ### skip_mmproj: -->
	weighted/imatrix quants of https://huggingface.co/naturecodeproject/vincent

	<!-- provided-files -->

	*For a convenient overview and download list, visit our [model page for this model](https://hf.tst.eu/model#vincent-i1-GGUF).*

	static quants are available at https://huggingface.co/mradermacher/vincent-GGUF
	## Usage

	If you are unsure how to use GGUF files, refer to one of [TheBloke's
	READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
	more details, including on how to concatenate multi-part files.

	## Provided Quants

	(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)

	\| Link \| Type \| Size/GB \| Notes \|
	\|:-----\|:-----\|--------:\|:------\|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.imatrix.gguf) \| imatrix \| 0.1 \| imatrix file (for creating your own quants) \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ1_S.gguf) \| i1-IQ1_S \| 1.7 \| for the desperate \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ1_M.gguf) \| i1-IQ1_M \| 1.9 \| mostly desperate \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ2_XXS.gguf) \| i1-IQ2_XXS \| 2.1 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ2_XS.gguf) \| i1-IQ2_XS \| 2.3 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ2_S.gguf) \| i1-IQ2_S \| 2.4 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ2_M.gguf) \| i1-IQ2_M \| 2.6 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q2_K_S.gguf) \| i1-Q2_K_S \| 2.6 \| very low quality \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q2_K.gguf) \| i1-Q2_K \| 2.8 \| IQ3_XXS probably better \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ3_XXS.gguf) \| i1-IQ3_XXS \| 2.9 \| lower quality \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ3_XS.gguf) \| i1-IQ3_XS \| 3.1 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q3_K_S.gguf) \| i1-Q3_K_S \| 3.3 \| IQ3_XS probably better \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ3_S.gguf) \| i1-IQ3_S \| 3.3 \| beats Q3_K* \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ3_M.gguf) \| i1-IQ3_M \| 3.4 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q3_K_M.gguf) \| i1-Q3_K_M \| 3.6 \| IQ3_S probably better \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q3_K_L.gguf) \| i1-Q3_K_L \| 3.9 \| IQ3_M probably better \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ4_XS.gguf) \| i1-IQ4_XS \| 4.0 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q4_0.gguf) \| i1-Q4_0 \| 4.2 \| fast, low quality \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-IQ4_NL.gguf) \| i1-IQ4_NL \| 4.2 \| prefer IQ4_XS \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q4_K_S.gguf) \| i1-Q4_K_S \| 4.2 \| optimal size/speed/quality \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q4_K_M.gguf) \| i1-Q4_K_M \| 4.5 \| fast, recommended \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q4_1.gguf) \| i1-Q4_1 \| 4.7 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q5_K_S.gguf) \| i1-Q5_K_S \| 5.1 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q5_K_M.gguf) \| i1-Q5_K_M \| 5.2 \| \|
	\| [GGUF](https://huggingface.co/mradermacher/vincent-i1-GGUF/resolve/main/vincent.i1-Q6_K.gguf) \| i1-Q6_K \| 6.0 \| practically like static Q6_K \|

	Here is a handy graph by ikawrakow comparing some lower-quality quant
	types (lower is better):

	![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

	And here are Artefact2's thoughts on the matter:
	https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

	## FAQ / Model Request

	See https://huggingface.co/mradermacher/model_requests for some answers to
	questions you might have and/or if you want some other model quantized.

	## Thanks

	I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
	me use its servers and providing upgrades to my workstation to enable
	this work in my free time. Additional thanks to [@nicoboss](https://huggingface.co/nicoboss) for giving me access to his private supercomputer, enabling me to provide many more imatrix quants, at much higher quality, than I would otherwise be able to.

	<!-- end -->