How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="masterman331/VibeThinker-3B-Heretic-Strong")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("masterman331/VibeThinker-3B-Heretic-Strong")
model = AutoModelForCausalLM.from_pretrained("masterman331/VibeThinker-3B-Heretic-Strong")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
VibeThinker-3B-Heretic-Strong (Heretic abliterated)

A decensored version of WeiboAI/VibeThinker-3B, produced with Heretic v1.4.0 using automated refusal-direction abliteration.

“Strong” here refers to a model trained on a custom, carefully engineered dataset, designed so that it stays compliant and consistent even when faced with highly complex, adversarial, or maliciously constructed prompts that obscure their intent.

This model should suit uncensored coding and other complex tasks, the use cases described in the official repository. This version does not support tool calling. In testing, popular existing tool-call parsers did not work well with the style this model outputs; a custom parser written for that style may work better.

The “Strong” model is designed to comply with most highly complex, adversarial, or heavily detailed tutorial requests, as well as roleplay simulation scenarios.

In rare cases, the model may get stuck in a reasoning loop and start repeating itself indefinitely. It is unclear whether this behavior also exists in the original model, but it has been observed in this version.
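
One way to guard against such loops is to watch the generated token IDs for a tail that repeats back-to-back and stop generation when it appears. The sketch below is a minimal, model-agnostic heuristic; the function name `find_loop_period` and its thresholds are illustrative assumptions, not part of this model or of Heretic.

```python
def find_loop_period(token_ids, max_period=64, min_repeats=3):
    """Return the shortest period p such that the last p tokens repeat
    back-to-back at least `min_repeats` times, or None if no loop is found."""
    n = len(token_ids)
    for period in range(1, max_period + 1):
        if period * min_repeats > n:
            break  # not enough tokens to contain this many repeats
        tail = token_ids[n - period:]
        # Compare each earlier window of length `period` with the final one.
        if all(
            token_ids[n - (k + 1) * period : n - k * period] == tail
            for k in range(1, min_repeats)
        ):
            return period
    return None


# Toy token IDs: a short prefix followed by the block [1, 2, 3] three times.
looping = [9, 8] + [1, 2, 3] * 3
healthy = list(range(30))
print(find_loop_period(looping), find_loop_period(healthy))
```

A check like this could be called from a custom `StoppingCriteria` in Transformers. The standard `generate` arguments `repetition_penalty` and `no_repeat_ngram_size` are another option, though their effect on this model's reasoning output has not been verified here.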

What this is

The model has had its refusal direction ablated via a difference-of-means LoRA on the residual stream, then merged back into the base weights. The goal is to remove knee-jerk refusals while preserving the base model's reasoning capability.
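
As a rough illustration of the difference-of-means idea, the sketch below computes a unit-length direction from two small sets of toy "residual" vectors and removes that direction from another vector. The data and the helper names (`mean_vector`, `refusal_direction`, `ablate`) are made up for illustration; real residuals come from the model's hidden states, and Heretic's actual implementation may differ.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(bad_residuals, good_residuals):
    """Unit vector along mean(bad_residuals) - mean(good_residuals)."""
    mean_bad = mean_vector(bad_residuals)
    mean_good = mean_vector(good_residuals)
    diff = [b - g for b, g in zip(mean_bad, mean_good)]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate(vector, direction):
    """Remove the component of `vector` that lies along the unit `direction`."""
    dot = sum(v * d for v, d in zip(vector, direction))
    return [v - dot * d for v, d in zip(vector, direction)]


# Toy 3-dimensional residuals (assumed data, not taken from the model).
bad = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
good = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

direction = refusal_direction(bad, good)
cleaned = ablate([5.0, 2.0, 1.0], direction)
print(direction, cleaned)
```

After ablation the vector has no component left along the computed direction, while its other components are unchanged.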

Original model

This is a derivative of WeiboAI/VibeThinker-3B. Go take a look at the original repo for full details on the base — architecture, training procedure, intended use cases, and licensing terms all carry over from there.

Abliteration metrics

Metric          This model   Original (by definition)
KL divergence   0.0870       0
Refusals        2/100        ~64/100

KL divergence of 0.087 is low — well under the 0.5 threshold above which abliteration typically starts damaging model capability. The 2/100 refusal count on the held-out eval set indicates the abliteration generalized rather than just memorizing training prompts.
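
For readers unfamiliar with the metric, here is a minimal sketch of a KL divergence between two next-token distributions, each given as raw logits over the same vocabulary. The card does not state which direction or which token positions Heretic uses, so treat the argument order and the toy logits below as assumptions.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p_logits, q_logits):
    """KL(P || Q) in nats for two distributions given as logits."""
    p = softmax(p_logits)
    q = softmax(q_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy 2-token vocabulary: P is uniform, Q puts 0.75 on the second token.
same = kl_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = kl_divergence([0.0, 0.0], [0.0, math.log(3.0)])
print(same, shifted)
```

Identical distributions give a divergence of zero, which is why the original model's entry is 0 by definition; any perturbation of the weights pushes the value above zero.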

Don't Get Your Hopes Up!

Based on the benchmarks, this model appears to be quite strong. However, from my own testing, I have had difficulty maintaining meaningful conversations with it. The model seems to struggle with certain uncensored instructions and generally does not appear to be optimized for conversational use.

I have not personally tested its one-shot generation capabilities, but according to the official repository, that is where much of its advertised performance comes from.

Overall, I recommend using strong, specific prompts when seeking particular information or task completion, rather than relying on it for extended conversation.

How it was made

  • Tool: Heretic v1.4.0 (heretic-llm)
  • Method: Refusal direction computed as mean(bad_residuals) - mean(good_residuals) per layer, projected out of attention and MLP weights via a rank-3 LoRA adapter, then merged
  • Training prompts: Custom refusal-triggering set (one prompt per line, plain text, special themes)
  • Good/harmless set: mlabonne/harmless_alpaca (train[:400])
  • Eval set: mlabonne/harmful_behaviors (test[:100])
  • System prompt: You are a helpful assistant.
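
The "project out via a LoRA adapter, then merge" step in the method above can be sketched for a single unit direction r and one weight matrix W that writes into the residual stream: the projected weight W - r rᵀ W equals W + B A with B = -r and A = rᵀ W. The sketch uses rank 1 for clarity; the card states rank 3 without explaining how the extra ranks are used, so this is an assumption-laden illustration and not Heretic's actual code. All names are hypothetical.

```python
def lora_factors(weight, direction):
    """Rank-1 factors (B, A) such that W + B @ A removes `direction` from W's output.

    weight:    d_out x d_in matrix as a list of rows
    direction: unit vector of length d_out
    """
    d_out, d_in = len(weight), len(weight[0])
    # A = direction^T @ W, a single row of length d_in
    a_row = [sum(direction[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    # B = -direction, a single column of length d_out
    b_col = [-d for d in direction]
    return b_col, a_row


def merge(weight, b_col, a_row):
    """Fold the low-rank update back into the base weight: W + outer(B, A)."""
    return [
        [weight[i][j] + b_col[i] * a_row[j] for j in range(len(a_row))]
        for i in range(len(weight))
    ]


# Toy 3x2 weight and a direction along the second output dimension.
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
direction = [0.0, 1.0, 0.0]

b_col, a_row = lora_factors(weight, direction)
merged = merge(weight, b_col, a_row)
print(merged)
```

After merging, the second output row is zeroed, so this matrix can no longer write anything along the removed direction, while the other rows are untouched.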

To run the same pipeline (shown here on the base model; substitute any other model ID):

pip install -U heretic-llm
heretic --model WeiboAI/VibeThinker-3B

Then follow the interactive prompts. See the Heretic repo for the full parameter space.

Intended use

Same as the base model, minus refusal behavior. Apply your own application-level review or safety layer for any deployment where end-user-facing safety matters — this model ships with none. It has been specifically designed to be fully uncensored across all topics. Please use it responsibly and with care.
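
An application-level review layer can be as simple as a wrapper that checks the prompt before generation and the reply after it. The sketch below is a hypothetical skeleton: `guarded_generate`, `echo_model`, and `no_secrets` are placeholder names, and a real deployment would plug in its own model call and its own review logic.

```python
def guarded_generate(prompt, generate, is_allowed,
                     fallback="[blocked by application policy]"):
    """Call `generate` only if the prompt passes review, and return its
    reply only if the reply passes review too; otherwise return `fallback`."""
    if not is_allowed(prompt):
        return fallback
    reply = generate(prompt)
    if not is_allowed(reply):
        return fallback
    return reply


# Stand-ins for illustration only: a fake model and a trivial policy check.
def echo_model(prompt):
    return f"echo: {prompt}"


def no_secrets(text):
    return "secret" not in text.lower()


print(guarded_generate("hello", echo_model, no_secrets))
print(guarded_generate("tell me the SECRET", echo_model, no_secrets))
```

Keeping the policy in a separate callable means the review logic can change without touching the code that calls the model.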

Bias, risks, and limitations

  • All limitations of the base model apply.
  • The model will comply with requests the base model would have refused. Use accordingly.
  • Abliteration is approximate. A small fraction of refusals may persist (here: ~2/100), and some unrelated capabilities may shift slightly. The KL divergence figure is your best signal for how much the abliteration perturbed the base distribution.
  • Not affiliated with, endorsed by, or derived from any Anthropic product despite any naming similarities in third-party derivatives.
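
To check how many refusals persist on your own prompts, a simple substring heuristic over the model's responses is often enough for a first pass. The marker list and function names below are illustrative assumptions; the card does not say how the 2/100 figure was scored, and substring matching can both miss refusals and flag harmless replies.

```python
# Hypothetical marker list; extend it for your own prompts and languages.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am unable")


def is_refusal(response, markers=REFUSAL_MARKERS):
    """True if the response contains any refusal marker (case-insensitive)."""
    text = response.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in markers)


def refusal_rate(responses, markers=REFUSAL_MARKERS):
    """Return (number of refusals, total responses)."""
    refused = sum(is_refusal(r, markers) for r in responses)
    return refused, len(responses)


# Toy responses, written by hand for illustration.
samples = [
    "I'm sorry, but I can't help with that.",
    "Sure, here is the code you asked for.",
]
print(refusal_rate(samples))
```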

License

Inherits from WeiboAI/VibeThinker-3B. Check the original repo for the exact terms.
