Not-For-All-Audiences

Instructions to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF",
	filename="DarkForest-20B-v2.0-iMat-IQ1_M.gguf",
)

output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M

Use Docker

docker model run hf.co/InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M

LM Studio
Jan
Ollama
How to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with Ollama:
```
ollama run hf.co/InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M
```

Unsloth Studio

How to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF to start chatting

Atomic Chat new
Docker Model Runner
How to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with Docker Model Runner:
```
docker model run hf.co/InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M
```

Lemonade

How to use InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF:Q4_K_M

Run and chat with the model

lemonade run user.DarkForest-20B-v2.0-iMat-GGUF-Q4_K_M

List all available models

lemonade list

DarkForest 20B v2.0 iMat GGUF

"The universe is a dark forest. Every civilization is an armed hunter stalking through the trees like a ghost, gently pushing aside branches that block the path and trying to tread without sound. Even breathing is done with care. The hunter has to be careful, because everywhere in the forest are stealthy hunters like him."- Liu Cixin

Quantized from fp16 with love. Importance Matrix calculated using Q8_0 quant and wiki.train.raw

For a brief rundown of iMatrix quant performance please see this PR

All quants are verified working prior to uploading to repo for your safety and convenience.

Importance matrix quantizations are a work in progress, IQ3 and above is recommended for best results.

Tip: Pick a size that can fit in your GPU while still allowing some room for context for best speed. You may need to pad this further depending on if you are running image gen or TTS as well.

Original model card can be found here

Previous Model Card

Continuation of an ongoing initiative to bring the latest and greatest models to consumer hardware through SOTA techniques that reduce VRAM overhead.

After testing the new important matrix quants for 11b and 8x7b models and being able to run them on machines without a dedicated GPU, we are now exploring the middleground - 20b.

❗❗Need a different quantization/model? Please open a community post and I'll get back to you - thanks ❗❗

UPDATE 3/4/24: Newer quants (IQ4_XS, IQ2_S, etc) are confirmed working in Koboldcpp as of version 1.60 - if you run into any issues kindly let me know.

IQ3_S has been generated after PR #5829 was merged. This should provide a significant speed boost even if you are offloading to CPU.

(Credits to TeeZee for the original model and ikawrakow for the stellar work on IQ quants)

DarkForest 20B v2.0

Model Details

To create this model two step procedure was used. First a new 20B model was created using microsoft/Orca-2-13b and KoboldAI/LLaMA2-13B-Erebus-v3 , deatils of the merge in darkforest_v2_step1.yml
then jebcarter/psyonic-cetacean-20B
and TeeZee/BigMaid-20B-v1.0 was used to produce the final model, merge config in darkforest_v2_step2.yml
The resulting model has approximately 20 billion parameters.

Warning: This model can produce NSFW content!

Results

main difference to v1.0 - model has much better sense of humor.
produces SFW nad NSFW content without issues, switches context seamlessly.
good at following instructions.
good at tracking multiple characters in one scene.
very creative, scenarios produced are mature and complicated, model doesn't shy from writing about PTSD, menatal issues or complicated relationships.
NSFW output is more creative and suprising than typical limaRP output.
definitely for mature audiences, not only because of vivid NSFW content but also because of overall maturity of stories it produces.
This is NOT Harry Potter level storytelling.

All comments are greatly appreciated, download, test and if you appreciate my work, consider buying me my fuel:

Downloads last month: 127

GGUF

Model size

20B params

Architecture

llama

Hardware compatibility

1-bit

2-bit

3-bit

4-bit

5-bit

6-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Collection including InferenceIllusionist/DarkForest-20B-v2.0-iMat-GGUF

GGUFs

Collection

I take requests, feel free to drop me a line in the community posts • 31 items • Updated Apr 28 • 3