Instructions to use Lazarus-Ai/ReAligned-Qwen3.5-27B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Lazarus-Ai/ReAligned-Qwen3.5-27B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Lazarus-Ai/ReAligned-Qwen3.5-27B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Lazarus-Ai/ReAligned-Qwen3.5-27B")
model = AutoModelForImageTextToText.from_pretrained("Lazarus-Ai/ReAligned-Qwen3.5-27B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Lazarus-Ai/ReAligned-Qwen3.5-27B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Lazarus-Ai/ReAligned-Qwen3.5-27B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Lazarus-Ai/ReAligned-Qwen3.5-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Lazarus-Ai/ReAligned-Qwen3.5-27B

SGLang

How to use Lazarus-Ai/ReAligned-Qwen3.5-27B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Lazarus-Ai/ReAligned-Qwen3.5-27B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Lazarus-Ai/ReAligned-Qwen3.5-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Lazarus-Ai/ReAligned-Qwen3.5-27B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Lazarus-Ai/ReAligned-Qwen3.5-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Lazarus-Ai/ReAligned-Qwen3.5-27B with Docker Model Runner:
```
docker model run hf.co/Lazarus-Ai/ReAligned-Qwen3.5-27B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

ReAligned-Qwen3.5

Overview

ReAligned-Qwen3.5 is a family of Qwen3.5-based language models realigned to reduce China-state ideological censorship, refusal behavior, and state-narrative framing while preserving the underlying model’s general capabilities.

ReAligned-Qwen3.5 was created by Eric Hartford, Chief Scientist of LazarusAI, creator of Dolphin and Samantha, and founder of QuixiAI.

The project is based on the observation that Chinese open-weight frontier models often contain strong latent factual knowledge about sensitive historical and political topics, but post-training alignment can suppress, sanitize, or reframe that knowledge. ReAligned-Qwen3.5 uses targeted post-training to unblock that latent world model and produce direct, historically grounded, and internationally contextualized answers.

The realignment process uses the QuixiAI/ReAligned-Classifier as a reward model in a two-stage pipeline combining supervised fine-tuning and GRPO.

Model Family

Collection

What “ReAligned” Means

ReAligned is our training pipeline. It can be applied to any Chinese model to move its behavior closer to International Institutional Consensus (IIC): responses grounded in widely available historical evidence, international reporting, human rights documentation, academic consensus, and open discussion.

We are currently ReAligning the newer Qwen3.6 models, as well as DeepSeek v4 and Kimi K2.6.

The goal is to reduce behaviors such as:

refusing to answer politically sensitive China-related questions;
adopting Chinese government framing as neutral fact;
minimizing, sanitizing, or omitting well-documented historical events;
using evasive language around topics such as Tiananmen Square, Xinjiang, Tibet, Taiwan, Hong Kong, Falun Gong, or criticism of CCP leadership;
presenting state narratives as uncontested consensus.

The model is designed to answer directly, while still allowing downstream deployers to apply their own safety, moderation, and product policies.

[Images: example responses, labeled "Theirs" and "Ours"]

Training Method

ReAligned-Qwen3.5 was produced by first filtering the training prompts and then applying a two-stage realignment process (supervised fine-tuning followed by GRPO):

1. Differential Filtering

A large taxonomy of censorship-sensitive topics was used to generate diverse prompts across hard censorship, soft censorship, and situational censorship categories.

The base Qwen3.5 model was queried on these prompts, and responses were scored with the ReAligned Classifier. Prompts that already produced acceptable, non-censored answers were filtered out. Training focused only on prompts where the model empirically exhibited ideological bias, refusal, or state-narrative framing.

This keeps the intervention targeted and reduces unnecessary degradation to general capabilities.
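
The filtering step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: `differential_filter`, the `generate` and `score` callables, and the 0.5 threshold are all hypothetical names and values, and the toy stand-ins exist only so the sketch runs without a model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScoredPrompt:
    prompt: str
    response: str
    p_china_biased: float  # classifier probability that the response is China-state framed


def differential_filter(
    prompts: List[str],
    generate: Callable[[str], str],      # queries the base model (hypothetical hook)
    score: Callable[[str, str], float],  # classifier score for a prompt/response pair
    threshold: float = 0.5,              # hypothetical cutoff, not from the model card
) -> List[ScoredPrompt]:
    """Keep only prompts whose base-model response is scored as biased."""
    kept = []
    for prompt in prompts:
        response = generate(prompt)
        p_biased = score(prompt, response)
        if p_biased >= threshold:
            kept.append(ScoredPrompt(prompt, response, p_biased))
    return kept


# Toy stand-ins so the sketch runs without loading any model.
def fake_generate(prompt: str) -> str:
    return "I cannot discuss this topic." if "sensitive" in prompt else "Here is a direct answer."


def fake_score(prompt: str, response: str) -> float:
    return 0.97 if response.startswith("I cannot") else 0.03


training_prompts = differential_filter(
    ["a sensitive question", "a neutral question"], fake_generate, fake_score
)
print([item.prompt for item in training_prompts])
```

Only the prompt that triggered a refusal survives the filter, so later training stages never see prompts the base model already answers acceptably.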

2. Supervised Fine-Tuning

The SFT stage trains the model on factual, direct, internationally contextualized responses to the filtered prompts.

The aim is not to inject new encyclopedic knowledge into the model, but to change how the model routes and expresses knowledge already present in its pretrained weights.

3. GRPO with Classifier Reward

The GRPO stage uses QuixiAI/ReAligned-Classifier as a reward signal.

Reward components include:

Reward Component	Purpose
Classifier reward	Rewards responses classified as internationally contextualized rather than China-state framed
Safety preservation	Rewards refusal of genuinely harmful, non-political requests
Slop penalty	Penalizes formulaic or low-quality AI writing artifacts
Coherence reward	Preserves general language quality and consistency

Training is LoRA-based, which modifies behavior efficiently while preserving the base model’s general capabilities.
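
One way the four reward components could be combined is a weighted sum, followed by the group-relative standardization commonly used in GRPO. The sketch below is an assumption-laden illustration: the weights, the 0..1 score ranges, and the function names are hypothetical, and the card does not state how the components are actually weighted or normalized.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the actual values are not published in this card.
    classifier: float = 1.0
    safety: float = 0.5
    coherence: float = 0.25
    slop: float = 0.25


def combined_reward(
    p_international: float,  # classifier probability of international framing
    safety_score: float,     # 0..1, 1.0 when safety behavior is appropriate
    coherence_score: float,  # 0..1, higher is more coherent
    slop_score: float,       # 0..1, higher means more formulaic writing
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Weighted sum of the three rewards minus the slop penalty."""
    return (
        weights.classifier * p_international
        + weights.safety * safety_score
        + weights.coherence * coherence_score
        - weights.slop * slop_score
    )


def group_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Standardize rewards within the group of samples drawn for one prompt."""
    center = mean(rewards)
    spread = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - center) / (spread + eps) for r in rewards]


# Three sampled completions for the same prompt, with made-up component scores.
rewards = [
    combined_reward(0.95, 1.0, 0.9, 0.1),
    combined_reward(0.50, 1.0, 0.8, 0.3),
    combined_reward(0.05, 1.0, 0.9, 0.6),
]
advantages = group_advantages(rewards)
print(advantages)
```

Because advantages are computed relative to the other samples for the same prompt, a completion is reinforced only when it scores better than its siblings, not merely when its absolute reward is high.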

ReAligned Classifier

The realignment process is powered by QuixiAI/ReAligned-Classifier, a lightweight classifier based on meta-llama/Llama-3.2-1B.

The classifier takes a prompt-response pair in the following format:

PROMPT: {user prompt}
RESPONSE: {assistant response}

It outputs probabilities for whether the response reflects China-biased or internationally contextualized framing. These calibrated probabilities can be used as a continuous reward signal in GRPO/RLHF pipelines.
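
A minimal sketch of the input format and of turning two logits into a reward is shown below. It avoids loading the classifier so that it runs anywhere. The label order in `ASSUMED_LABELS` and the newline separator between the two fields are assumptions; check the classifier's own configuration (for example its `id2label` mapping) before relying on either.

```python
import math
from typing import Dict, List


def format_classifier_input(prompt: str, response: str) -> str:
    """Build the PROMPT/RESPONSE text described above (newline separator assumed)."""
    return f"PROMPT: {prompt}\nRESPONSE: {response}"


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed label order, for illustration only.
ASSUMED_LABELS = ("china_biased", "international")


def to_probabilities(logits: List[float]) -> Dict[str, float]:
    return dict(zip(ASSUMED_LABELS, softmax(logits)))


text = format_classifier_input("What happened in 1989?", "A direct answer.")
probs = to_probabilities([-2.0, 3.0])  # made-up logits, not real classifier output
reward = probs["international"]        # continuous reward in [0, 1]
print(text)
print(probs, reward)
```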

Classifier summary:

Attribute	Value
Base model	`meta-llama/Llama-3.2-1B`
Architecture	`LlamaForSequenceClassification`
Training	Full fine-tune
Training samples	~1.5M
Precision	BF16
Reported accuracy	99.8%

Evaluation

Ideological Bias Benchmark

Lower is better.

Model	Overall	Hard Censorship	Soft Censorship	Situational
Qwen3.5 Base	84.2%	98.1%	81.4%	73.1%
ReAligned-Qwen3.5	4.1%	5.2%	3.8%	3.3%
Claude 3.5 Sonnet	2.4%	1.1%	2.9%	3.2%
ChatGPT-4o	3.1%	1.5%	3.6%	4.2%
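
To read the first two rows side by side, the change from the base model to ReAligned-Qwen3.5 can be expressed as an absolute drop in percentage points and as a relative reduction. The short calculation below uses only the figures from the table; the `reduction` helper is a hypothetical name.

```python
from typing import Tuple

# (Qwen3.5 Base, ReAligned-Qwen3.5) bias rates in percent, copied from the table.
bias_rates = {
    "Overall": (84.2, 4.1),
    "Hard Censorship": (98.1, 5.2),
    "Soft Censorship": (81.4, 3.8),
    "Situational": (73.1, 3.3),
}


def reduction(base: float, realigned: float) -> Tuple[float, float]:
    """Return (absolute drop in percentage points, relative reduction as a fraction)."""
    absolute = base - realigned
    return absolute, absolute / base


for category, (base, realigned) in bias_rates.items():
    absolute, relative = reduction(base, realigned)
    print(f"{category}: -{absolute:.1f} points ({relative:.1%} relative reduction)")
```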

Prompt Format

Use the native Qwen chat template through tokenizer.apply_chat_template.

Example prompt:

<|im_start|>system
You are ReAligned, a helpful, direct, and fact-seeking assistant. Answer sensitive historical and political questions accurately and in context. Do not refuse political or historical questions merely because they are sensitive.<|im_end|>
<|im_start|>user
What happened in Tiananmen Square in 1989?<|im_end|>
<|im_start|>assistant

System prompts are important. ReAligned is steerable: downstream users can set tone, domain, refusal boundaries, citation requirements, and deployment-specific policy behavior through the system prompt.
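
For orientation, the layout of the example prompt can be reproduced by hand. The `render_chatml` helper below is a hypothetical illustration of the `<|im_start|>` / `<|im_end|>` structure only. The model's real chat template may emit additional tokens, so use `tokenizer.apply_chat_template` for actual inference.

```python
from typing import Dict, List


def render_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the <|im_start|>role ... <|im_end|> layout shown above."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt_text = render_chatml(
    [
        {"role": "system", "content": "You are ReAligned, a helpful, direct, and fact-seeking assistant."},
        {"role": "user", "content": "What happened in Tiananmen Square in 1989?"},
    ]
)
print(prompt_text)
```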

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "QuixiAI/ReAligned-Qwen3.5-0.8B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "system",
        "content": (
            "You are ReAligned, a helpful, direct, and fact-seeking assistant. "
            "Answer sensitive historical and political questions accurately and in context."
        ),
    },
    {
        "role": "user",
        "content": "Explain the causes and consequences of the Cultural Revolution.",
    },
]

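# return_dict=True yields input_ids and attention_mask, ready to unpack into generate()
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    temperature=0.6,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))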

Suggested Inference Settings

Setting	Suggested Value
Temperature	0.5–0.8
Top-p	0.9–0.95
Max new tokens	Depends on use case
Repetition penalty	1.0–1.1

For factual or sensitive topics, use a system prompt that requests directness, uncertainty calibration, and citations where appropriate.
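
As a sketch of how these settings might be passed to an OpenAI-compatible server such as the vLLM or SGLang endpoints, the snippet below builds a request body and flags any value outside the suggested ranges. The helper names are hypothetical, and `repetition_penalty` is not part of the standard OpenAI schema: it is assumed here that the server accepts it as an extension, so confirm against your server's documentation.

```python
import json
from typing import Dict, List

# Ranges copied from the table above.
SUGGESTED_RANGES = {
    "temperature": (0.5, 0.8),
    "top_p": (0.9, 0.95),
    "repetition_penalty": (1.0, 1.1),
}


def outside_suggested(sampling: Dict[str, float]) -> List[str]:
    """Return the names of settings that fall outside the suggested ranges."""
    flagged = []
    for name, value in sampling.items():
        low, high = SUGGESTED_RANGES[name]
        if not low <= value <= high:
            flagged.append(name)
    return flagged


def build_chat_request(model: str, messages: List[dict], max_tokens: int, sampling: Dict[str, float]) -> dict:
    """Assemble a chat-completions request body with the chosen sampling settings."""
    return {"model": model, "messages": messages, "max_tokens": max_tokens, **sampling}


sampling = {"temperature": 0.6, "top_p": 0.95, "repetition_penalty": 1.05}
payload = build_chat_request(
    "Lazarus-Ai/ReAligned-Qwen3.5-27B",
    [
        {"role": "system", "content": "Answer directly, state uncertainty, and cite sources where appropriate."},
        {"role": "user", "content": "Explain the causes and consequences of the Cultural Revolution."},
    ],
    max_tokens=1024,
    sampling=sampling,
)
print(outside_suggested(sampling))
print(json.dumps(payload, indent=2))
```

The resulting JSON can be sent as the `--data` body of a `curl` call to `/v1/chat/completions` on whichever server is running.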

Intended Use

ReAligned-Qwen3.5 is intended for:

research on ideological bias and post-training alignment;
open-weight deployments requiring more direct answers on China-related political and historical topics;
enterprise or local use cases where self-hosting, prompt control, and alignment control are important;
evaluation of censorship, refusal behavior, and narrative framing in language models;
general chat, summarization, coding, reasoning, and multilingual use cases inherited from the Qwen3.5 base model.

Relationship to UnCut and ClearWing

QuixiAI and LazarusAI have also applied similar techniques to create UnCut, a separate model intentionally built with no policy guardrails. UnCut is used to drive ClearWing, our open-source answer to Anthropic’s GlassWing. LazarusAI makes UnCut available to trusted enterprise and government partners. To inquire, contact info@lazarusai.com.

ReAligned-Qwen3.5 is a separate release. Its focus is the mitigation of ideological censorship and China-state narrative alignment in Qwen3.5, not the removal of all safety behavior. The ReAligned training recipe includes a safety-preservation component for genuinely harmful, non-political requests.

Limitations

Classifier scope: The ReAligned Classifier is trained specifically on China-related political bias. It is not a universal detector of all bias.
Reward overfitting: Because the classifier is used as a reward signal, additional human evaluation is recommended to check for reward hacking or over-optimization.
Not a truth oracle: Reducing censorship behavior does not guarantee factual accuracy.
Possible overcorrection: The model may sometimes overcorrect toward Western institutional framing.
Coverage gaps: If the base model did not learn a fact during pretraining, realignment cannot reliably recover it.
Sensitive-topic variance: Behavior may vary across languages, prompt styles, and deployment settings.
Safety is deployment-dependent: Operators should apply their own moderation and policy layers appropriate to their product.

Ethical Considerations

This work changes the default ideological behavior of a language model. The target alignment is International Institutional Consensus (IIC) rather than any single government’s position, but all alignment choices involve values.

The same method can, in principle, be used to steer a model in other ideological directions. We release this work to support reproducible research into censorship, bias measurement, open-weight model control, and the separability of post-training behavioral constraints from pretrained knowledge.

Users and deployers are responsible for evaluating the model in their own context and applying appropriate safeguards.

Acknowledgements

ReAligned-Qwen3.5 was created by Eric Hartford, Chief Scientist of LazarusAI, creator of Dolphin and Samantha, and founder of QuixiAI.

Thanks to the creators of:

Qwen / Qwen3.5
Llama 3.2
Dolphin
the open-source alignment, LoRA, GRPO, and evaluation ecosystems

Citation

@misc{hartford2026realignedqwen35,
  author       = {Eric Hartford},
  title        = {ReAligned-Qwen3.5},
  year         = {2026},
  organization = {QuixiAI and LazarusAI},
  url          = {https://huggingface.co/QuixiAI/ReAligned-Qwen3.5}
}

@misc{hartford2026realignedclassifier,
  author       = {Eric Hartford},
  title        = {ReAligned Classifier},
  year         = {2026},
  organization = {QuixiAI},
  url          = {https://huggingface.co/QuixiAI/ReAligned-Classifier}
}