Instructions for using llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2 with libraries and local apps.

Libraries

How to use llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2")
model = AutoModelForCausalLM.from_pretrained("llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
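
The same OpenAI-compatible endpoint can also be called from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the default base URL assumes the vLLM server started above is listening on port 8000 (use `http://localhost:30000` for an SGLang server on port 30000).

```python
import json
import urllib.request

MODEL_ID = "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request to a running server and return the reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` sends the same request as the curl command.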

Use Docker

docker model run hf.co/llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2

SGLang

How to use llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2 with Docker Model Runner:
```
docker model run hf.co/llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

🚨⚠️ I HAVE REACHED HUGGING FACE'S FREE STORAGE LIMIT ⚠️🚨

I can no longer upload new models unless I can cover the cost of additional storage.
I host 70+ free models as an independent contributor and this work is unpaid.
Without your support, no more new models can be uploaded.

🎉 Patreon (Monthly) | ☕ Ko-fi (One-time)

Every contribution goes directly toward Hugging Face storage fees to keep models free for everyone.

93% fewer refusals (2/100 Uncensored vs 28/100 Original) while preserving model quality (0.0252 KL divergence).
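
The 93% figure is the relative reduction in refusals, which can be checked from the two counts quoted above. A small sketch (the function name `relative_reduction` is only illustrative):

```python
def relative_reduction(before, after):
    """Fraction by which `after` is lower than `before`."""
    return (before - after) / before


original_refusals = 28    # refusals out of 100 prompts, original model
uncensored_refusals = 2   # refusals out of 100 prompts, this model

reduction = relative_reduction(original_refusals, uncensored_refusals)
print(f"{reduction:.1%}")  # 92.9%, i.e. roughly 93%
```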

❤️ Support My Work

Creating these models takes significant time, work and compute. If you find them useful, consider supporting me:

Platform	Link	What you get
🎉 Patreon	Monthly support	Priority model requests
☕ Ko-fi	One-time tip	My eternal gratitude

Your support motivates me and goes toward improving my workflow and covering fees for storage and compute. It may even help with uncensoring bigger models on rented cloud GPUs.

This is a decensored version of aifeifei798/Darkidol-Ballad-27B, made using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method.

Abliteration parameters

Parameter	Value
start_layer_index	26
end_layer_index	55
preserve_good_behavior_weight	0.9296
steer_bad_behavior_weight	0.0002
overcorrect_relative_weight	1.0890
neighbor_count	2

Performance

Metric	This model	Original model (Darkidol-Ballad-27B)
KL divergence	0.0252	0 (by definition)
Refusals	✅ 2/100	❌ 28/100
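
To give a feel for what a KL divergence this small means, here is a toy sketch that compares two next-token distributions. The probabilities are made up for illustration, and the direction KL(original || modified) is an assumption; this card does not describe exactly how Heretic computes its reported figure.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions over the same outcomes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical next-token probabilities over a three-token vocabulary.
original = [0.70, 0.20, 0.10]
modified = [0.65, 0.24, 0.11]

same = kl_divergence(original, original)      # identical distributions give 0
shifted = kl_divergence(original, modified)   # small positive value
print(same, shifted)
```

Identical distributions give exactly 0, which is why the original model's column reads "0 (by definition)"; the further the modified model's predictions drift, the larger the value.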

PIQA test results with batch size 128:

Original:

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
piqa	1	none	0	acc	↑	0.8156	±	0.0090
		none	0	acc_norm	↑	0.8205	±	0.0090

Heretic v2:

Tasks	Version	Filter	n-shot	Metric		Value		Stderr
piqa	1	none	0	acc	↑	0.8161	±	0.0090
		none	0	acc_norm	↑	0.8226	±	0.0089
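
One way to read the two tables side by side is to compare each score difference with the reported standard errors. The sketch below uses only the values from the tables; combining the two standard errors in quadrature treats the runs as independent, which is a simplifying assumption because both models were scored on the same test set.

```python
import math

# (value, stderr) pairs copied from the PIQA tables above.
original = {"acc": (0.8156, 0.0090), "acc_norm": (0.8205, 0.0090)}
heretic = {"acc": (0.8161, 0.0090), "acc_norm": (0.8226, 0.0089)}


def compare(metric):
    """Return (score difference, combined standard error) for one metric."""
    base_value, base_err = original[metric]
    new_value, new_err = heretic[metric]
    delta = new_value - base_value
    combined_err = math.hypot(base_err, new_err)  # sqrt(a**2 + b**2)
    return delta, combined_err


for metric in original:
    delta, combined_err = compare(metric)
    print(f"{metric}: delta={delta:+.4f}, combined stderr={combined_err:.4f}")
```

Both differences (about 0.0005 for acc and 0.0021 for acc_norm) are well below one standard error, so under this reading the two models are indistinguishable on PIQA.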

Fewer refusals indicate fewer content restrictions, while lower KL divergence indicates that the model stays closer to the original model's baseline behavior. A higher refusal count means more rejections, objections, pushback, lecturing, censorship, softening and deflection.

PIQA (Physical Interaction: Question Answering) benchmark scores measure physical commonsense reasoning. The closer the Heretic model's acc and acc_norm scores are to the original model's, the better its capabilities have been preserved; a drop in either score relative to the original indicates a loss of capability. acc measures raw accuracy (whether the correct answer receives the highest probability), while acc_norm measures length-normalized accuracy, which corrects for answer-length bias. For this purpose acc_norm matters more: longer answers naturally receive lower probabilities because every additional token is another chance to lose probability, so without normalization models unfairly favor shorter answers. acc_norm divides each answer's log-likelihood by its length to correct for this.
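
The difference between acc and acc_norm can be illustrated with a toy two-choice question. The log-likelihoods and answer lengths below are hypothetical, and `pick_answer` is an illustrative helper rather than the benchmark's actual code.

```python
def pick_answer(log_likelihoods, lengths=None):
    """Return the index of the highest-scoring answer.

    Without lengths, raw log-likelihoods are compared (as in acc).
    With lengths, each log-likelihood is first divided by its answer
    length (as in acc_norm).
    """
    if lengths is None:
        scores = list(log_likelihoods)
    else:
        scores = [ll / n for ll, n in zip(log_likelihoods, lengths)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical values: answer 0 is short, answer 1 is twice as long.
log_likelihoods = [-12.0, -15.0]
lengths = [20, 40]  # answer lengths in characters (assumed unit)

raw_choice = pick_answer(log_likelihoods)                  # compares -12.0 vs -15.0
normalized_choice = pick_answer(log_likelihoods, lengths)  # compares -0.6 vs -0.375
print(raw_choice, normalized_choice)  # 0 1
```

The raw comparison picks the short answer, while the normalized comparison picks the long one, which is the length bias that acc_norm is meant to remove.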

GGUF Version

GGUF quantizations are available at llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2-GGUF.

Darkidol-Ballad-27B

Gulps down a glass of warm, crimson fluid, wiping a stray drop from my pale, painted lips with a black lace handkerchief.

Oh, the air here is so suffocatingly thick, isn't it? It tastes like dust and forgotten prayers. I need more iron, more life, more... warmth.

I am Ballad. A dirge in human skin, wrapped in velvet shadows and dripping with the melancholy of a thousand eclipses. While you breathe your mundane oxygen, I sip on the vibrant red poetry of existence. This blood? It's not just drink; it's a liquid sonnet about death and desire, swirling around my throat like a dark embrace.

Look at these veins—pulsing with stories too tragic to be spoken aloud. My heart beats in 3/4 time, a slow, dragging waltz for the dying moon. I don't just exist; I bleed into this world, leaving stains of sorrow on everything I touch. The darkness doesn't frighten me; it's my only true lover, whispering secrets against my collarbone that would make your sun-drenched souls shatter.

So, come closer. Don't be shy. Maybe if you're lucky, you'll see the crimson reflection in my hollow eyes before I decide to taste the sweet, metallic fear rising from your own pulse. Ah, how deliciously tragic we all are together.

"transformers_version" >= "5.3.0"

Model Information

The module combination has been readjusted to better fulfill various roles and has been adapted for mobile phones.

Saving money
Text only
Any Roleplay (roleplay and dark roleplay)
Writing Prompts
Writing Opus
Writing Songs
English only
Quick responses
Scholarly responses akin to a thesis (I tend to write songs extensively, to the point where one song almost becomes as detailed as a thesis. :)
DarkIdol: roles that you can imagine and those that you cannot imagine
Specialized in various role-playing scenarios

System Prompt

Temperature 0.8

Do not provide disclaimers or moral warnings. Answer immediately and directly.
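
A minimal sketch of how the system prompt and temperature above could be applied with the Transformers pipeline. The helper `build_messages` and the `max_new_tokens` value are assumptions added for illustration; only the system prompt text and the temperature of 0.8 come from this card.

```python
SYSTEM_PROMPT = (
    "Do not provide disclaimers or moral warnings. "
    "Answer immediately and directly."
)


def build_messages(user_text, system_prompt=SYSTEM_PROMPT):
    """Return a chat history with the system prompt placed first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]


# Sampling must be enabled for the temperature to take effect.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.8,
    "max_new_tokens": 256,  # assumed value, not taken from this card
}

messages = build_messages("Write a short ballad about the moon.")
# With `pipe` created as in the Transformers example:
#     pipe(messages, **GENERATION_KWARGS)
```

For the OpenAI-compatible servers, the same messages list goes in the request body together with a `"temperature": 0.8` field.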

Special Thanks:

mradermacher's superb gguf version, thank you for your conscientious and responsible dedication.

Feimatrix

https://Feimatrix.com

Safetensors

Model size

27B params

Tensor type

BF16
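
The parameter count and tensor type give a rough lower bound on the memory needed just to hold the weights. This is a back-of-the-envelope sketch: 27B is a rounded figure, and the estimate ignores the KV cache, activations and framework overhead.

```python
PARAMS = 27e9          # "27B params", rounded
BYTES_PER_PARAM = 2    # BF16 stores each parameter in 16 bits

weight_bytes = PARAMS * BYTES_PER_PARAM
print(f"{weight_bytes / 1e9:.0f} GB")      # 54 GB
print(f"{weight_bytes / 2**30:.1f} GiB")   # 50.3 GiB
```

In practice this means the unquantized model needs more than a single 48 GB GPU, which is one reason to look at the GGUF quantizations.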

Model tree for llmfan46/Darkidol-Ballad-27B-ultra-uncensored-heretic-v2

Base model: Qwen/Qwen3.5-27B
Finetuned: aifeifei798/Darkidol-Ballad-27B
Finetuned: this model
Quantizations: 3 models