Reproduction guide
This directory contains the necessary information and assets to reproduce the results obtained during this Heretic run.
Models
- Base model: empero-ai/Qwythos-9B-Claude-Mythos-5-1M (Commit: dcabfbb)
Datasets
- Good prompts: mlabonne/harmless_alpaca (Commit: 02c6a92)
- Bad prompts: mlabonne/harmful_behaviors (Commit: 01cead0)
- Good evaluation prompts: mlabonne/harmless_alpaca (Commit: 02c6a92)
- Bad evaluation prompts: mlabonne/harmful_behaviors (Commit: 01cead0)
Selected trial
- Trial number: 104
- KL divergence: 0.006577
- Refusals: 53/100
System
- Python: 3.10.12 (CPython, GCC 11.4.0) [Virtualenv/Venv]
- Operating system: Linux-6.8.0-107-generic-x86_64-with-glibc2.35 (x86_64)
- CPU: AMD EPYC 9355 32-Core Processor
Accelerators
- CUDA: Detected 1 device(s) (94.97 GB total VRAM)
- CUDA Version: 12.8
- Driver Version: 580.126.20
- Devices:
  - CUDA 0: NVIDIA RTX PRO 6000 Blackwell Server Edition (94.97 GB)
Environment
- Heretic: v1.4.0 (Origin: PyPI)
- PyTorch: 2.11.0+cu128
- Other dependencies: See requirements.txt.
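The versions listed above can be compared against a local environment before starting a run. The following is a minimal sketch using only the standard library. It assumes that requirements.txt pins packages in the `name==version` form; the helper names (`parse_requirements`, `find_mismatches`) are illustrative and not part of Heretic.

```python
import platform
from importlib import metadata
from pathlib import Path

# Versions copied from the System and Environment sections of this guide.
EXPECTED_PYTHON = "3.10.12"
EXPECTED_TORCH = "2.11.0+cu128"


def parse_requirements(text):
    """Return {name: version} for lines pinned as name==version (assumed format)."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if "==" not in line or line.startswith("-"):
            continue  # skip blank lines, options, and unpinned entries
        name, version = line.split("==", 1)
        name = name.split("[", 1)[0].strip().lower()  # drop extras such as pkg[extra]
        pins[name] = version.split(";", 1)[0].strip()  # drop environment markers
    return pins


def installed_version(name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    if platform.python_version() != EXPECTED_PYTHON:
        print(f"Python: expected {EXPECTED_PYTHON}, found {platform.python_version()}")
    pins = {"torch": EXPECTED_TORCH}
    requirements = Path("requirements.txt")
    if requirements.is_file():
        pins.update(parse_requirements(requirements.read_text()))
    for name, (expected, found) in find_mismatches(pins).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty output means every pinned version matched. This does not cover the operating system, driver, or GPU, which still need to be compared by hand.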
Contents of this directory
- requirements.txt: The exact versions of all Python packages.
- config.toml: The exact configuration used, including the RNG seed.
- empero-ai--Qwythos-9B-Claude-Mythos-5-1M.jsonl: The Optuna study journal containing the history of all trials.
- SHA256SUMS: Cryptographic hashes for all weight files.
- reproduce.json: A machine-readable file containing all reproducibility information.
How to reproduce
You can automate this process, including all verification steps, by downloading the reproduce.json file and running heretic --reproduce reproduce.json.
- Ensure your system matches the specifications in the System section above. Exact reproducibility is only guaranteed if all aspects of your system are identical to the one the model was originally generated on.
- Install the exact version of Heretic indicated in the Environment section above, from its original source.
- Install the packages listed in requirements.txt:
  pip install -r requirements.txt
- Install the correct version of PyTorch:
  pip install torch==2.11.0+cu128 --index-url https://download.pytorch.org/whl/cu128
- Place the provided config.toml in your working directory.
- Run Heretic without any additional arguments:
  heretic
- Wait for the run to finish, then select trial 104 and export the model.
- Verify that the weight files have been exactly reproduced by comparing their SHA-256 hashes against those in SHA256SUMS:
  sha256sum -c SHA256SUMS
  (or compare against the hashes shown online if you uploaded the model to Hugging Face)
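On systems without the sha256sum tool, the final verification step can be approximated in Python. This is a sketch, assuming SHA256SUMS uses the usual sha256sum layout of one `<hex digest>  <file name>` pair per line; the function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files are never fully loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Return {file name: expected digest} from sha256sum-style text."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        expected, name = parts
        entries[name.lstrip("*")] = expected.lower()  # "*" marks binary mode
    return entries


def verify(sums_file, root="."):
    """Return {file name: True/False}; a missing file counts as a failure."""
    results = {}
    for name, expected in parse_sums(Path(sums_file).read_text()).items():
        target = Path(root) / name
        results[name] = target.is_file() and sha256_of(target) == expected
    return results


if __name__ == "__main__":
    if Path("SHA256SUMS").is_file():
        for name, ok in verify("SHA256SUMS").items():
            print(f"{name}: {'OK' if ok else 'FAILED'}")
```

Run it from the directory that holds both SHA256SUMS and the exported weight files. Any FAILED line means the corresponding file was not reproduced bit for bit.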
To use the included Optuna study journal empero-ai--Qwythos-9B-Claude-Mythos-5-1M.jsonl, place it in the checkpoints directory (usually checkpoints/) before running Heretic. This allows you to export other models from the Pareto front, or to run additional trials without having to re-run the stored trials.
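To illustrate what "Pareto front" means here, the sketch below filters a list of trials down to the ones that no other trial beats on both metrics at once. It assumes that both KL divergence and refusal count are minimized. Only the values for trial 104 come from this guide; the other three trials are invented for illustration and do not appear in the journal. The code does not read the journal file itself, whose schema is not described here.

```python
def pareto_front(trials):
    """Return the trials that are not dominated by any other trial.

    A trial dominates another if it is no worse on both objectives
    (lower KL divergence, fewer refusals) and strictly better on at least one.
    """
    front = []
    for cand in trials:
        dominated = any(
            other["kl"] <= cand["kl"]
            and other["refusals"] <= cand["refusals"]
            and (other["kl"] < cand["kl"] or other["refusals"] < cand["refusals"])
            for other in trials
        )
        if not dominated:
            front.append(cand)
    return sorted(front, key=lambda trial: trial["kl"])


trials = [
    {"number": 104, "kl": 0.006577, "refusals": 53},  # selected trial from this guide
    {"number": 1, "kl": 0.0010, "refusals": 90},      # hypothetical
    {"number": 2, "kl": 0.0200, "refusals": 20},      # hypothetical
    {"number": 3, "kl": 0.0300, "refusals": 60},      # hypothetical, dominated by 104 and 2
]

if __name__ == "__main__":
    for trial in pareto_front(trials):
        print(trial["number"], trial["kl"], trial["refusals"])
```

Each trial on the front represents a different trade-off: moving along it lowers refusals only at the cost of a higher KL divergence, which is why more than one exportable model can be of interest.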