Not-For-All-Audiences

Instructions for using Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with libraries and local apps.

Libraries

How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with llama-cpp-python:

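```
# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hub (cached after the first run) and load it.
llm = Llama.from_pretrained(
    repo_id="Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs",
    filename="CREC-n-WREC-Mate-24B-v2-bf16.gguf",
)

# Plain text completion; echo=True includes the prompt in the returned text.
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
```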
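
The call above returns a completion dictionary rather than a plain string. A minimal sketch of pulling out just the generated text, assuming the OpenAI-style `choices[0]["text"]` layout that llama-cpp-python uses for completions. The helper name `completion_text` and the sample dictionary are illustrative only and not part of the library:

```python
def completion_text(output: dict) -> str:
    """Return the text of the first choice in a completion-style response."""
    choices = output.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"]


# Hand-written stand-in for what `llm(...)` returns; the values are made up.
sample = {
    "object": "text_completion",
    "choices": [
        {"index": 0, "text": "Once upon a time, there was a fox.", "finish_reason": "stop"}
    ],
}
print(completion_text(sample))
```

With a loaded model, `completion_text(output)` would take the place of `print(output)`.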

Local Apps

llama.cpp

How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with llama.cpp:

Install from brew

```
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Install from WinGet (Windows)

```
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Use pre-built binary

```
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Build from source code

```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Use Docker

```
docker model run hf.co/Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```
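
Once `llama-server` is running, it can be queried from a script through its OpenAI-compatible API. The following is a minimal sketch using only the Python standard library. It assumes the server is listening on `http://localhost:8080` (adjust `BASE_URL` if you started it with a different host or port). The names `build_chat_request` and `chat` are hypothetical helpers, not part of llama.cpp:

```python
import json
import urllib.request

# Assumption: llama-server is reachable at this address.
BASE_URL = "http://localhost:8080"


def build_chat_request(prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt: str) -> str:
    """Send the prompt to the server and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server, so it is left commented out):
# print(chat("Write a short World Info entry for a tavern keeper."))
```

Keeping request construction separate from the network call makes the payload easy to inspect or adapt before anything is sent.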

vLLM

How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Ollama
How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with Ollama:
```
ollama run hf.co/Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Unsloth Studio

How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

```
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs to start chatting
```

Install Unsloth Studio (Windows)

```
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs to start chatting
```

Using HuggingFace Spaces for Unsloth

No setup required. Open https://huggingface.co/spaces/unsloth/studio in your browser and search for Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs to start chatting.

Docker Model Runner
How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with Docker Model Runner:
```
docker model run hf.co/Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Lemonade

How to use Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs with Lemonade:

Pull the model

```
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs:Q4_K_M
```

Run and chat with the model

```
lemonade run user.CREC-n-WREC-Mate-24B-v2-GGUFs-Q4_K_M
```

List all available models

```
lemonade list
```

CREC-n-WREC-Mate-24B-v2

THIS MODEL IS UNOFFICIAL!

This model has no official affiliation with Weather and his SillyTavern Extensions. This is simply a fan project to help fellow users of these extensions.

Merge Description

CREC-n-WREC-Mate is a model made to help create World Info entries mid-roleplay using the SillyTavern extensions CREC and WREC.

The responses are a bit on the shorter side by default, which should be all the more beneficial for creating World Info entries. This isn't a model designed for creating Char Cards. Instead, it's meant for saving characters you encounter on your adventures to a Lorebook, so make sure to enable the feature that allows adding characters to a WI entry in the CREC settings menu.

WREC Setup: here

CREC Setup: here

Merge Details

This is a merge of pre-trained language models created using mergekit.

Merge Method

This model was merged using the Conflict-Aware N:M Sparsification (CABS) merge method, with TheDrummer/Cydonia-24B-v2.1 as the base.
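
To give a feel for the "N:M" part of the method name, here is a minimal sketch of N:M magnitude pruning on a toy list of numbers: within every block of `m` consecutive values, only the `n` largest by absolute value are kept. This illustrates the sparsity pattern only. It is not mergekit's implementation and does not show the conflict-aware ordering between models. The function name `nm_prune` and the sample values are made up for illustration:

```python
def nm_prune(values: list[float], n: int, m: int) -> list[float]:
    """Keep the n largest-magnitude entries in each block of m; zero the rest."""
    if not 0 < n <= m:
        raise ValueError("require 0 < n <= m")
    pruned = []
    for start in range(0, len(values), m):
        block = values[start:start + m]
        # Indices of the n largest |value| entries within this block.
        order = sorted(range(len(block)), key=lambda i: abs(block[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(block))
    return pruned


# Toy "task vector" (fine-tuned weights minus base weights); numbers are made up.
delta = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
print(nm_prune(delta, n=2, m=4))
# -> [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.25, 0.0]
```

Assuming `n_val` and `m_val` in the configurations below play the roles of `n` and `m`, a setting of 64 and 128 would keep half of each block, while 12 and 32 would keep 37.5%.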

Models Merged

The following models were included in the merge:

Configuration

The following YAML configurations were used to produce this model:

C-n-W-CharGen_v2

```
models:
  - model: CharGen-Archive/CharGen-v3-beta-275-s0
  - model: Mawdistical/Mawdistic-NightLife-24b
    parameters:
      weight: 0.3
      n_val: 64
      m_val: 128
  - model: SlerpE/CardProjector-24B-v3
    parameters:
      weight: 0.2
      n_val: 16
      m_val: 32
merge_method: cabs
pruning_order:
  - Mawdistical/Mawdistic-NightLife-24b
  - SlerpE/CardProjector-24B-v3
base_model: CharGen-Archive/CharGen-v3-beta-275-s0
dtype: float32
tokenizer:
  source: union
  tokens:
    </s>:
      source:
        kind: model_token
        model: CharGen-Archive/CharGen-v3-beta-275-s0
        token: "<|im_end|>"
    "[INST]":
      source:
        kind: model_token
        model: CharGen-Archive/CharGen-v3-beta-275-s0
        token: "<|im_start|>"
```

C-n-W-CardProj_v2

```
models:
  - model: SlerpE/CardProjector-24B-v3
  - model: ReadyArt/Broken-Tutu-24B
    parameters:
      weight: 0.3
      n_val: 64
      m_val: 128
  - model: CharGen-Archive/CharGen-v3-beta-275-s0
    parameters:
      weight: 0.2
      n_val: 16
      m_val: 32
merge_method: cabs
pruning_order:
  - ReadyArt/Broken-Tutu-24B
  - CharGen-Archive/CharGen-v3-beta-275-s0
base_model: SlerpE/CardProjector-24B-v3
dtype: float32
tokenizer:
  source: union
  tokens:
    "[/INST]":
      source:
        kind: model_token
        model: CharGen-Archive/CharGen-v3-beta-275-s0
        token: "<|im_end|>"
      source:
        kind: model_token
        model: CharGen-Archive/CharGen-v3-beta-275-s0
        token: "<|im_start|>"
```

CREC-n-WREC-Mate-24B-v2

```
models:
  - model: TheDrummer/Cydonia-24B-v2.1
  - model: C-n-W-CardProj_v2
    parameters:
      weight: 0.6
      n_val: 64
      m_val: 128
  - model: C-n-W-CharGen_v2
    parameters:
      weight: 0.4
      n_val: 12
      m_val: 32
merge_method: cabs
pruning_order:
  - C-n-W-CardProj_v2
  - C-n-W-CharGen_v2
base_model: TheDrummer/Cydonia-24B-v2.1
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
  tokens:
    "[/INST]":
      source:
        kind: model_token
        model: C-n-W-CardProj_v2
        token: "[/INST]"
      source:
        kind: model_token
        model: C-n-W-CharGen_v2
        token: "[/INST]"
    "[INST]":
      source:
        kind: model_token
        model: C-n-W-CardProj_v2
        token: "[INST]"
      source:
        kind: model_token
        model: C-n-W-CharGen_v2
        token: "[INST]"
    </s>:
      source:
        kind: model_token
        model: C-n-W-CardProj_v2
        token: "</s>"
      source:
        kind: model_token
        model: C-n-W-CharGen_v2
        token: "</s>"
```

GGUF

Model size: 24B params
Architecture: llama
Hardware compatibility: 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
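
As a rough guide for choosing among the bit widths listed above, the size of the weights can be estimated as parameters times bits per weight divided by 8. This is a back-of-the-envelope sketch only: it treats the nominal bit width as exact, whereas real GGUF quantization types typically spend somewhat more bits per weight, and it ignores the extra memory needed for context at runtime. The helper name `approx_weights_gb` is illustrative:

```python
def approx_weights_gb(params: float, bits_per_weight: float) -> float:
    """Rough size of the weights alone, in gigabytes (10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


# 24B parameters at each nominal bit width listed on this page.
for bits in (3, 4, 5, 6, 8, 16):
    print(f"{bits:>2}-bit: ~{approx_weights_gb(24e9, bits):.0f} GB")
```

Under these assumptions the estimates range from about 9 GB at 3-bit to about 48 GB at 16-bit, so actual files and memory use should be expected to come out higher.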

Model tree for Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs

Base model: Casual-Autopsy/CREC-n-WREC-Mate-24B-v2
Quantized (3): this model

Paper for Casual-Autopsy/CREC-n-WREC-Mate-24B-v2-GGUFs

CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging (arXiv:2503.01874, published Feb 26, 2025)