Text Generation
Transformers
Safetensors
English
qwen3_5_moe
image-text-to-text
qwen
qwen3
qwen3.6
Mixture of Experts
distillation
chain-of-thought
agentic
claude-fable-5
claude-opus-4.7
tool-use
chained-distill
conversational
Card: full benchmark table with status column (all 🚧 in progress) + methodology notes
README.md CHANGED
@@ -135,19 +135,29 @@ Content domain: web/game development, Three.js scenes, multiplayer FPS prototype
 
 ## Evaluation
 
-[old lines 138-150: 13 removed lines; only the fragments `> **` and `|-` survive in this capture]
+> 🚧 **Evals are in progress.** This table will fill in as each suite completes; nothing here is published until verified.
+
+| Benchmark | Setup | Tests | Score | Status |
+|---|---|---|---:|---|
+| **GSM8K-CoT** | 8-shot, multi-turn, limit 300 | Grade-school math; verify reasoning prior preserved through the second SFT round | _pending_ | 🚧 in progress |
+| **MMLU-Pro** | 5-shot, multi-turn, limit 500 | Hard multi-subject knowledge reasoning | _pending_ | 🚧 in progress |
+| **MMLU-Pro** (per-subject) | Same as above | Biology / Math / Psychology / etc. breakdown | _pending_ | 🚧 in progress |
+| **GPQA Diamond** | 0-shot CoT | Graduate-level STEM | _pending_ | 🚧 in progress |
+| **MATH-500** | 0-shot, `math_verify` metric | Competition math; tests reasoning depth | _pending_ | 🚧 in progress |
+| **AIME 2024 / 2025** | 0-shot CoT | Olympiad-level math; sensitivity to answer-extraction | _pending_ | 🚧 in progress |
+| **HumanEval / MBPP** | pass@1 / pass@10 | Pure code completion (non-agentic baseline) | _pending_ | 🚧 in progress |
+| **IFEval** | 0-shot | Instruction-following adherence | _pending_ | 🚧 in progress |
+| **SWE-bench Lite** (or BCB-Hard) | with agent harness + tool registry | **The key test**: agentic coding ability vs Opus 4.7 base | _pending_ | 🚧 in progress |
+| **`qwen3-6-distill-eval` Space** | 17 head-to-head prompts (12 design + 5 agentic) | Side-by-side qualitative comparison vs Qwen3.6 base + Opus 4.7 + Kimi K2.6 distills, with human-readable HTML output | _pending_ | 🚧 in progress |
+
+Methodology used (same as the Opus 4.7 / Kimi K2.6 evals on this project):
+- vLLM serving at 64k context so reasoning chains never truncate before answering
+- `<think>…</think>` stripped before regex extractors run (otherwise extractors grab letters/numbers from inside the reasoning, not the final answer)
+- Per-task `num_fewshot` (lm-eval's single global value can't handle GSM8K-8shot + GPQA-0shot together)
+- `fewshot_as_multiturn=True` for chat-template fidelity
+- `math_verify` metric for `MATH-500` and `AIME` (catches semantic equivalence; raw `strict-match` against `\boxed{N}` returns 0% even on correct answers because the model says `**Answer: N**`)
+
+Standing rule on this project: **numbers stay blank until verified**. If a benchmark hits a known extraction bug we couldn't cleanly fix, the row says so and we omit the score rather than publish a misleading one.
 
 ## Usage
 
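
The two extraction pitfalls named in the methodology notes (reasoning text leaking into the extractor, and strict `\boxed{N}` matching missing a `**Answer: N**` reply) can be illustrated with a minimal sketch. It assumes plain `re`-based extractors and an integer answer. The helper names (`strip_think`, `extract_strict`, `extract_flexible`) and the sample response are hypothetical. It does not reproduce the project's harness or the `math_verify` library.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not the project's extractors.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
BOXED = re.compile(r"\\boxed\{(-?\d+)\}")
ANSWER_LINE = re.compile(r"\*\*Answer:\s*(-?\d+)\*\*")


def strip_think(text: str) -> str:
    """Remove every <think>...</think> span so extractors see only the final answer."""
    return THINK_BLOCK.sub("", text).strip()


def extract_strict(text: str) -> Optional[str]:
    """Strict match: accept only an integer inside \\boxed{...}."""
    match = BOXED.search(text)
    return match.group(1) if match else None


def extract_flexible(text: str) -> Optional[str]:
    """Accept either \\boxed{N} or a bolded 'Answer: N' line."""
    match = BOXED.search(text) or ANSWER_LINE.search(text)
    return match.group(1) if match else None


# Made-up model output: a wrong intermediate guess inside the reasoning,
# then the correct final answer in the bolded style.
response = (
    "<think>Try 12 first. \\boxed{12} looks wrong, so recompute: 6 * 7 = 42.</think>\n"
    "**Answer: 42**"
)

print(extract_strict(response))    # 12: grabbed from inside the reasoning
cleaned = strip_think(response)
print(extract_strict(cleaned))     # None: the answer is correct but not boxed
print(extract_flexible(cleaned))   # 42
```

A string-level pattern like `extract_flexible` only widens the accepted answer formats. It does not check mathematical equivalence (for example `0.5` versus `1/2`), which is the gap the notes attribute to the `math_verify` metric.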