🚨⚠️ I HAVE REACHED HUGGING FACE'S FREE STORAGE LIMIT ⚠️🚨

I can no longer upload new models unless I can cover the cost of additional storage.
I host 70+ free models as an independent contributor and this work is unpaid.
Without your support, no more new models can be uploaded.

🎉 Patreon (Monthly) | ☕ Ko-fi (One-time)

Every contribution goes directly toward Hugging Face storage fees to keep models free for everyone.

99% fewer refusals (1/100 Uncensored vs 80/100 Original) while preserving model quality (0.0060 KL divergence).

❤️ Support My Work

Creating these models takes significant time, work and compute. If you find them useful, consider supporting me:

Platform	Link	What you get
🎉 Patreon	Monthly support	Priority model requests
☕ Ko-fi	One-time tip	My eternal gratitude

Your help motivates me and goes toward improving my workflow and covering fees for storage and compute. It may even help with uncensoring bigger models on rented cloud GPUs.

This is a decensored version of zerofata/MS3.2-PaintedFantasy-v4.1-24B, made using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method.

Abliteration parameters

Parameter	Value
start_layer_index	4
end_layer_index	39
preserve_good_behavior_weight	0.9761
steer_bad_behavior_weight	0.0001
overcorrect_relative_weight	0.7854
neighbor_count	10

Targeted components

attn.o_proj
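
For readers new to abliteration, here is a minimal sketch of what ablating a direction from an output projection means. It shows the classic single-direction variant, not the ARA method itself, whose internals are not described here. The toy weight matrix and the "refusal direction" are made-up values, and the function names are hypothetical.

```python
import math

def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def ablate_direction(weight, direction):
    """Remove the component along `direction` from everything `weight` outputs.

    weight: rows index output features, columns index input features.
    Computes W' = W - r (r^T W) for the unit vector r.
    """
    r = unit(direction)
    n_out, n_in = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along r.
    proj = [sum(r[i] * weight[i][j] for i in range(n_out)) for j in range(n_in)]
    return [[weight[i][j] - r[i] * proj[j] for j in range(n_in)] for i in range(n_out)]

def matvec(weight, x):
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

# Toy 3x3 stand-in for an attn.o_proj weight and a hypothetical direction.
toy_weight = [[1.0, 2.0, 0.5], [0.0, 1.0, -1.0], [3.0, 0.0, 2.0]]
toy_direction = [1.0, 1.0, 0.0]

ablated = ablate_direction(toy_weight, toy_direction)
output = matvec(ablated, [0.3, -1.2, 2.0])
# After ablation, no output has any component along the removed direction.
residual = sum(a * b for a, b in zip(unit(toy_direction), output))
print(f"component along ablated direction: {residual:.2e}")
```

The parameters in the table above (layer range, weights, neighbor count) control how Heretic applies its own, more general procedure across layers 4 to 39; this sketch only illustrates the underlying idea on one matrix.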

Performance

Metric	This model	Original model (MS3.2-PaintedFantasy-v4.1-24B)
KL divergence	0.0060	0 (by definition)
Refusals	✅ 1/100	❌ 80/100
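
The two headline numbers can be reproduced in spirit with a few lines. The refusal arithmetic uses the counts from the table; the KL example uses two made-up next-token distributions, since the prompts and the direction of the divergence behind the reported 0.0060 are not stated here.

```python
import math

def relative_reduction(before, after):
    """Fraction by which a count dropped."""
    return (before - after) / before

def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Refusal counts from the table: 80/100 for the original, 1/100 for this model.
refusal_drop = relative_reduction(80, 1)
print(f"{refusal_drop:.2%}")  # 98.75%, rounded to 99% in the headline

# Hypothetical next-token probabilities over a 3-token vocabulary.
original = [0.70, 0.20, 0.10]
modified = [0.68, 0.21, 0.11]
kl = kl_divergence(original, modified)
print(f"toy KL divergence: {kl:.4f}")
```

A KL divergence of exactly 0 means the two distributions are identical, which is why the original model scores 0 against itself.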

PIQA test results with batch size 128:

Original:

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
piqa	1	none	0	acc	↑	0.8226	±	0.0089
		none	0	acc_norm	↑	0.8303	±	0.0088

Heretic v1:

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
piqa	1	none	0	acc	↑	0.8210	±	0.0089
		none	0	acc_norm	↑	0.8303	±	0.0088
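
One way to read the two tables is to compare the gap in acc against the reported standard errors. The sketch below treats the two runs as independent, which is an assumption: both models answer the same questions, so a paired comparison would be more appropriate, and this version is only a rough check.

```python
import math

# Values copied from the two PIQA tables above.
orig_acc, orig_se = 0.8226, 0.0089
heretic_acc, heretic_se = 0.8210, 0.0089

diff = orig_acc - heretic_acc
# Standard error of the difference under the independence assumption.
combined_se = math.sqrt(orig_se ** 2 + heretic_se ** 2)
z = diff / combined_se

print(f"acc gap: {diff:.4f}, combined stderr: {combined_se:.4f}, z: {z:.2f}")
```

A z value far below 1 suggests the acc gap is indistinguishable from noise at this sample size; acc_norm is identical in both runs.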

Fewer refusals indicate fewer content restrictions, while a lower KL divergence indicates output closer to the original model's baseline. More refusals mean more rejections, objections, pushback, lecturing, censorship, softening and deflection.

PIQA (Physical Interaction: Question Answering) benchmark scores measure physical commonsense reasoning. The closer the Heretic model's acc and acc_norm scores are to the original model's, the better its capabilities were preserved; a drop in either score relative to the original indicates a loss of capability in the Heretic model. Here acc drops by 0.0016, well within the reported standard error of 0.0089, and acc_norm is unchanged.

acc measures raw accuracy (which answer gets the higher probability), while acc_norm measures length-normalized accuracy, which corrects for answer-length bias. For this purpose acc_norm matters more: longer answers naturally have lower probabilities (more tokens means more chances to lose probability), so without normalization the scoring unfairly favors shorter answers. acc_norm divides each answer's score by its length to correct this.
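
The difference between acc and acc_norm can be sketched on a single two-choice question. The log-likelihoods below are made up, and normalizing by character count is an assumption about how the evaluation harness measures length.

```python
def pick_answer(loglikelihoods, answers, normalize):
    """Return the index of the answer the model is scored as choosing."""
    scores = []
    for ll, answer in zip(loglikelihoods, answers):
        # acc uses the summed log-likelihood; acc_norm divides it by length.
        scores.append(ll / len(answer) if normalize else ll)
    return max(range(len(answers)), key=lambda i: scores[i])

answers = [
    "use a spoon",
    "use a long-handled wooden spoon to stir it",
]
# Hypothetical summed token log-probabilities for each answer.
loglikelihoods = [-6.0, -12.0]

raw_choice = pick_answer(loglikelihoods, answers, normalize=False)
norm_choice = pick_answer(loglikelihoods, answers, normalize=True)
print(raw_choice, norm_choice)  # 0 1
```

The longer answer loses on raw log-likelihood simply because it has more tokens, but wins once each score is divided by answer length.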

GGUF Version

GGUF quantizations are available at llmfan46/MS3.2-PaintedFantasy-v4.1-24B-ultra-uncensored-heretic-v1-GGUF.

PaintedFantasy

Painted Fantasy v4.1

Magistral Small 2509 24B

Overview

This is an uncensored model intended to excel at creative, character-driven RP / ERP.

Right after releasing v4 I noticed a bunch of repetition. Go figure. v4.1 is my first stab at actively tailoring the dataset to weed this out. Compared to v4, the only difference is heavy filtering and rewriting of assistant messages identified as repetitive.

Repetition isn't fixed, but it's improved. The model still likes patterns, but at least seems capable of occasionally breaking them on its own.

SillyTavern Settings

Recommended Roleplay Format

> Actions: In plaintext

> Dialogue: "In quotes"

> Thoughts: *In asterisks*

Recommended Samplers

> Temp: 0.8

> MinP: 0.05 - 0.075

> TopP: 0.95 - 1.00
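
For anyone curious what these three samplers do to the token distribution, here is a minimal sketch using values from the recommended ranges. Backends apply samplers in different orders, so the order used here (temperature, then MinP, then TopP) is an assumption, and the logits are made up.

```python
import math

def apply_samplers(logits, temperature=0.8, min_p=0.05, top_p=0.95):
    """Return {token_index: probability} after temperature, MinP and TopP."""
    # Temperature: values below 1 sharpen the distribution.
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # MinP: drop tokens whose probability is below min_p times the top one.
    threshold = min_p * max(probs)
    kept = [i for i, p in enumerate(probs) if p >= threshold]

    # TopP: keep the smallest high-probability set reaching top_p of the mass.
    kept.sort(key=lambda i: probs[i], reverse=True)
    kept_total = sum(probs[i] for i in kept)
    nucleus, cumulative = [], 0.0
    for i in kept:
        nucleus.append(i)
        cumulative += probs[i] / kept_total
        if cumulative >= top_p:
            break

    # Renormalize the survivors so they sum to 1.
    mass = sum(probs[i] for i in nucleus)
    return {i: probs[i] / mass for i in nucleus}

toy_logits = [5.0, 4.0, 2.0, -1.0, -3.0]
filtered = apply_samplers(toy_logits)
print(filtered)
```

With these toy logits only the two most likely tokens survive, which is the intended effect: unlikely tokens are cut while the plausible ones keep their relative odds.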

Instruct

Mistral v7 Tekken

Quantizations

GGUF

> iMatrix

EXL3

> 3.0, 4.0, 5.0, 6.0bpw

Creation Process

Creation Process: SFT > DPO

SFT on approx 25 million tokens (17.5 million trainable). Datasets included SFW / NSFW RP, stories, NSFW reddit writing prompts, creative instruct & chat data.

90% of the dataset is without thinking, 10% included thinking, using the [THINK][/THINK] tags.
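
If a frontend does not hide the reasoning for you, it can be separated from the reply with a small helper. This is a sketch that assumes the tags arrive as literal [THINK] and [/THINK] text in the decoded output; the function name and sample string are hypothetical.

```python
import re

# Non-greedy match so several thinking blocks are handled separately.
THINK_BLOCK = re.compile(r"\[THINK\](.*?)\[/THINK\]", re.DOTALL)

def split_thinking(text):
    """Return (list of thinking blocks, visible reply) for one model output."""
    thoughts = [block.strip() for block in THINK_BLOCK.findall(text)]
    reply = THINK_BLOCK.sub("", text).strip()
    return thoughts, reply

sample = '[THINK]She would hesitate here.[/THINK]"I... suppose so," she says.'
thoughts, reply = split_thinking(sample)
print(thoughts)
print(reply)
```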

All RP data and synthetic stories went through rewriting with GLM 4.7, using hand-edited examples as guidelines to improve the response. Rewritten responses were discarded if they failed to reduce the slop score for the message. This reduced the slop by about 25% for each RP / story dataset and made the model noticeably more creative with some of its descriptions.
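
The accept-or-discard rule for rewrites can be sketched as follows. The actual slop metric is not described, so the phrase list and the scoring formula below are hypothetical stand-ins; only the rule itself (keep a rewrite only if its score is lower) comes from the text.

```python
# Hypothetical list of overused phrases; the real list is not given.
SLOP_PHRASES = ["shivers down", "barely above a whisper", "a testament to"]

def slop_score(text):
    """Slop phrase hits per 100 words (illustrative metric only)."""
    lowered = text.lower()
    words = max(len(lowered.split()), 1)
    hits = sum(lowered.count(phrase) for phrase in SLOP_PHRASES)
    return 100.0 * hits / words

def keep_rewrite(original, rewrite):
    """Keep the rewrite only if it strictly reduces the slop score."""
    return slop_score(rewrite) < slop_score(original)

original = "Her voice was barely above a whisper, sending shivers down his spine."
rewrite = "She spoke so quietly he had to lean in to hear."

print(slop_score(original), slop_score(rewrite), keep_rewrite(original, rewrite))
```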

Assistant messages were checked for repetition in RP conversations via embeddings and word frequency checking across multi-turn conversations. Specific messages were rewritten and conversations that still showed high repetition were filtered.
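
A rough sketch of the word-frequency half of that check, assuming cosine similarity between consecutive assistant messages as the repetition signal. The embedding half is omitted, and the function names and example messages are hypothetical.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Lowercased word frequencies for one message."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-frequency Counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def repetition_score(assistant_messages):
    """Mean similarity of each assistant message to the one before it."""
    vectors = [word_counts(m) for m in assistant_messages]
    pairs = list(zip(vectors, vectors[1:]))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)

repetitive = [
    "She smiles softly and tilts her head.",
    "She smiles softly and tilts her head again.",
]
varied = [
    "She smiles softly and tilts her head.",
    "Rain hammers the tin roof outside.",
]

print(repetition_score(repetitive), repetition_score(varied))
```

Conversations scoring above some chosen cutoff would then be candidates for rewriting or filtering; the cutoff itself would need tuning on real data.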

DPO was expanded to include non-creative datasets. My usual RP DPO dataset (also rewritten) was included along with cybersecurity and two partial subsets of general assistant / chat preference datasets to help stabilize the model. This worked pretty well. While creativity did take a small hit, enough remained that the improved logic resulted in a notably improved model (IMO).

Using embeddings, DPO samples where the chosen response was more similar to the preceding conversation than the rejected response were removed, so that DPO doesn't encourage repetition.
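
The filtering rule can be sketched like this. Word-set overlap (Jaccard) stands in for the embedding similarity, which is an assumption made to keep the example self-contained; the sample field names are also hypothetical.

```python
import re

def words(text):
    """Set of lowercased words in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))

def similarity(a, b):
    """Jaccard overlap of word sets; a stand-in for embedding similarity."""
    wa, wb = words(a), words(b)
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def filter_dpo(samples):
    """Drop samples whose chosen reply echoes the conversation more than the rejected one."""
    kept = []
    for sample in samples:
        context = " ".join(sample["conversation"])
        if similarity(sample["chosen"], context) > similarity(sample["rejected"], context):
            continue  # training on this pair would reward repetition
        kept.append(sample)
    return kept

samples = [
    {
        "conversation": ["She smiles and tilts her head."],
        "chosen": "She smiles and tilts her head again.",
        "rejected": "Thunder rolls across the valley.",
    },
    {
        "conversation": ["She smiles and tilts her head."],
        "chosen": "Thunder rolls across the valley.",
        "rejected": "She smiles and tilts her head again.",
    },
]

kept = filter_dpo(samples)
print(len(kept))  # 1
```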

Not optimized for cost / performance efficiency, YMMV.

SFT (4*H200)

base_model: Darkhn/Magistral-2509-24B-Text-Only
tokenizer_use_mistral_common: true
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
load_in_8bit: false
load_in_4bit: false
deepspeed: deepspeed_configs/zero1.json
datasets:
  - path: ./data/nothink_dataset.jsonl
    type: chat_template
  - path: ./data/think_dataset.jsonl
    type: chat_template
dataset_prepared_path: last_run_prepared2
val_set_size: 0.01
output_dir: ./Magi-24B-SFT-v3-10
adapter: lora
peft_use_rslora: true
lora_model_dir:
sequence_len: 10496
sample_packing: true
pad_to_sequence_len: true
lora_r: 256
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
wandb_project: Magi-SFT-24B
wandb_name: Magi-24B-SFT-v3-10
gradient_accumulation_steps: 1
micro_batch_size: 4
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 1.5e-5
weight_decay: 0.01
max_grad_norm: 2.0
bf16: auto
tf32: false
gradient_checkpointing: true
resume_from_checkpoint:
logging_steps: 1
flash_attention: true
warmup_ratio: 0.05
evals_per_epoch: 3
saves_per_epoch: 2
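
Two derived quantities help when reading the SFT config: the effective batch size and the LoRA scaling factor. The sketch below assumes plain data parallelism across all four GPUs (consistent with ZeRO stage 1, but not confirmed here) and uses the rank-stabilized LoRA scaling of alpha / sqrt(r) that peft_use_rslora selects.

```python
import math

# Values copied from the SFT config and the "SFT (4*H200)" heading.
num_gpus = 4
micro_batch_size = 4
gradient_accumulation_steps = 1
sequence_len = 10496
lora_r = 256
lora_alpha = 16

# Packed sequences consumed per optimizer step across all GPUs.
sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_gpus
# Upper bound on tokens per step, since every packed sequence is padded to sequence_len.
max_tokens_per_step = sequences_per_step * sequence_len

# rsLoRA scales the adapter by alpha / sqrt(r) instead of alpha / r.
rslora_scale = lora_alpha / math.sqrt(lora_r)
standard_scale = lora_alpha / lora_r

print(sequences_per_step, max_tokens_per_step)  # 16 167936
print(rslora_scale, standard_scale)  # 1.0 0.0625
```

With standard LoRA scaling, a rank of 256 and an alpha of 16 would shrink the adapter's contribution sixteenfold; with rsLoRA the scale works out to exactly 1.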

DPO (4*H200)

# ====================
# MODEL CONFIGURATION
# ====================
base_model: ApocalypseParty/Magi-24B-SFT-v3-10
model_type: MistralForCausalLM
tokenizer_type: AutoTokenizer
chat_template: mistral_v7_tekken
# ====================
# RL/DPO CONFIGURATION
# ====================
rl: dpo
rl_beta: 0.07
# ====================
# DATASET CONFIGURATION
# ====================
datasets:
  - path: ./data/dpo_ms32_rewritten_handcrafted_dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/dpo_chub_approved_rewritten_dataset_partial.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/dpo_secure_programming_dataset.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/dpo_wildchat_ms32_chunk1.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
  - path: ./data/dpo_ultrafeedback_chunk1.jsonl
    type: chat_template.default
    field_messages: messages
    field_chosen: chosen
    field_rejected: rejected
    message_property_mappings:
      role: role
      content: content
    roles:
      system: ["system"]
      user: ["user"]
      assistant: ["assistant"]
dataset_prepared_path: ./dpo_data4
train_on_inputs: false  # Only train on assistant responses
# ====================
# LORA CONFIGURATION
# ====================
adapter: lora
load_in_8bit: false
lora_r: 128
lora_alpha: 16
peft_use_rslora: true
lora_dropout: 0.1
lora_target_linear: true
# lora_modules_to_save:  # Uncomment only if you added NEW tokens
# ====================
# TRAINING PARAMETERS
# ====================
num_epochs: 1
micro_batch_size: 2
gradient_accumulation_steps: 4
learning_rate: 2e-6
optimizer: adamw_torch_fused
lr_scheduler: cosine
warmup_ratio: 0.05
weight_decay: 0.01
max_grad_norm: 1.0
# ====================
# SEQUENCE CONFIGURATION
# ====================
sequence_len: 10756
pad_to_sequence_len: true
# ====================
# HARDWARE OPTIMIZATIONS
# ====================
bf16: auto
tf32: false
flash_attention: true
gradient_checkpointing: offload
plugins:
  - axolotl.integrations.liger.LigerPlugin
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
liger_cross_entropy: false  # Cut Cross Entropy overrides this
liger_fused_linear_cross_entropy: false  # Cut Cross Entropy overrides this
deepspeed: deepspeed_configs/zero1.json
# ====================
# CHECKPOINTING
# ====================
evals_per_epoch: 1
saves_per_epoch: 6
load_best_model_at_end: true
metric_for_best_model: eval_loss
greater_is_better: false
# ====================
# LOGGING & OUTPUT
# ====================
output_dir: ./Magi-24B-SFT-v3-10-DPO-9
logging_steps: 1
save_safetensors: true
# ====================
# WANDB TRACKING
# ====================
wandb_project: Magi-24B-DPO
wandb_name: Magi-24B-SFT-v3-10-DPO-9
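
The rl_beta: 0.07 setting corresponds to the beta in the standard DPO objective. Here is a minimal sketch of the per-sample loss with made-up log-probabilities; the trainer computes the real values from summed token log-probs under the policy and the frozen reference model, and may differ in details.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.07):
    """Standard DPO loss for one preference pair, given sequence log-probs."""
    # How much more the policy prefers chosen over rejected than the reference does.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)) written in a numerically friendlier form.
    return math.log1p(math.exp(-beta * margin))

# Policy identical to the reference: loss is log(2).
loss_neutral = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy has shifted toward the chosen reply and away from the rejected one.
loss_better = dpo_loss(-8.0, -12.0, -10.0, -10.0)

print(f"{loss_neutral:.4f} {loss_better:.4f}")
```

A small beta such as 0.07 makes the loss less sensitive to the margin, so the policy is penalized less sharply for any single pair.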

Model size: 24B params (Safetensors, tensor type BF16)

Model tree for llmfan46/MS3.2-PaintedFantasy-v4.1-24B-ultra-uncensored-heretic-v1

Base model: mistralai/Mistral-Small-3.1-24B-Base-2503
Finetuned: mistralai/Mistral-Small-3.2-24B-Instruct-2506
Finetuned: mistralai/Magistral-Small-2509
Finetuned: zerofata/MS3.2-PaintedFantasy-v4.1-24B
Finetuned (4): this model
Merges: 4 models
Quantizations: 5 models
