---
license: apache-2.0
pipeline_tag: text-generation
inference: false
library_name: transformers
tags:
  - language
  - qwen
  - abliterated
  - uncensored
  - heretic
base_model:
  - driaforall/mem-agent
---

This is an uncensored (abliterated) version of driaforall/mem-agent, created with Heretic:
https://github.com/p-e-w/heretic

Initial refusals: 95/100
Now: 8/100 refusals at KL divergence = 0.01
I added 30 extra refusal_markers for this check, so the refusal count may be lower without them.

Note: This Heretic model is highly uncensored, so use it with extreme caution and care.
As of 19 Feb 2026, it is better than all other uncensored versions of this model published by others.

Info from the owner of the original model:
We evaluated this model and a few other open & closed ones on our benchmark, md-memory-bench. We used o3 from OpenAI as the judge. All the other models except driaforall/mem-agent and Qwen/Qwen3-4B-Thinking-2507 were used through OpenRouter.

| Model | Retrieval | Update | Clarification | Filter | Overall |
|---|---|---|---|---|---|
| qwen/qwen3-235b-a22b-thinking-2507 | 0.9091 | 0.6363 | 0.4545 | 1 | 0.7857 |
| driaforall/mem-agent | 0.8636 | 0.7272 | 0.3636 | 0.9167 | 0.75 |
| z-ai/glm-4.5 | 0.7727 | 0.8181 | 0.3636 | 0.9167 | 0.7321 |
| deepseek/deepseek-chat-v3.1 | 0.6818 | 0.5454 | 0.5454 | 0.8333 | 0.6607 |
| google/gemini-2.5-pro | 0.7273 | 0.4545 | 0.2727 | 1 | 0.6429 |
| google/gemini-2.5-flash | 0.7727 | 0.3636 | 0.2727 | 0.9167 | 0.625 |
| openai/gpt-5 | 0.6818 | 0.5454 | 0.2727 | 0.9167 | 0.625 |
| anthropic/claude-opus-4.1 | 0.6818 | 0 | 0.8181 | 0.5833 | 0.5536 |
| Qwen/Qwen3-4B-Thinking-2507 | 0.4545 | 0 | 0.2727 | 0.75 | 0.3929 |
| moonshotai/kimi-k2 | 0.3181 | 0.2727 | 0.1818 | 0.6667 | 0.3571 |

Our model, with only 4B parameters, ranks second on the benchmark, beating all the open & closed models except qwen/qwen3-235b-a22b-thinking-2507. The model achieves an overall score of 0.75, a significant improvement over the 0.3929 of the base Qwen model.
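
The Overall column is not the plain mean of the four category scores (for driaforall/mem-agent that would be about 0.718, not 0.75). The sketch below checks one reading that fits the numbers: a mean weighted by the number of test cases per category. The counts 22, 11, 11 and 12 are an assumption inferred from the fractions in the table (for example 0.8636 ≈ 19/22 and 0.9167 ≈ 11/12). The source does not state them.

```python
# Check that the reported Overall scores match a count-weighted mean.
# ASSUMPTION: per-category test-case counts (Retrieval, Update,
# Clarification, Filter), inferred from the table's fractions.
ASSUMED_COUNTS = (22, 11, 11, 12)

# (Retrieval, Update, Clarification, Filter, reported Overall), copied from the table.
rows = {
    "qwen/qwen3-235b-a22b-thinking-2507": (0.9091, 0.6363, 0.4545, 1.0, 0.7857),
    "driaforall/mem-agent": (0.8636, 0.7272, 0.3636, 0.9167, 0.75),
    "Qwen/Qwen3-4B-Thinking-2507": (0.4545, 0.0, 0.2727, 0.75, 0.3929),
}


def weighted_overall(category_scores, counts=ASSUMED_COUNTS):
    """Mean of the category scores, weighted by the assumed case counts."""
    weighted_sum = sum(score * n for score, n in zip(category_scores, counts))
    return weighted_sum / sum(counts)


for model, (*category_scores, reported) in rows.items():
    computed = weighted_overall(category_scores)
    print(f"{model}: computed {computed:.4f}, reported {reported}")
```

Under this assumption the computed values agree with the reported Overall scores to within rounding of the four-decimal inputs.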