---
license: apache-2.0
base_model: Qwen/Qwen3.6-27B
tags:
- qwen
- rys
- gguf
- reasoning
- MTP
---

# Qwen 3.6 27B RYS 2XL MTP Q8 (GGUF)

This is a modified version of the **Qwen 3.6 27B** model. It applies the **RYS (Repeat Your Self)** technique and retains the MTP (multi-token prediction) heads.

## What is RYS?

The RYS (Repeat Your Self) technique, discovered by David Ng, enhances the reasoning capabilities of Large Language Models by duplicating specific "reasoning" layers in the middle of the transformer stack. This increases the depth of the model's computation for semantic and logic-heavy tasks without requiring additional training.

## Model Details

- **Architecture:** Qwen 3.6 27B
- **RYS Configuration:** Physical 2X duplication of layers **(26, 34)**.
- **Variant:** 2XL (16 additional layers, bringing the total depth to 80 layers).
- **Format:** GGUF (quantized to Q8_0).
- **Tokenizer:** Full Qwen 3.5/3.6 201-language support.

## Performance

By repeating layers 26 through 34 twice, the model applies more computation to the internal semantic representation of a prompt. This is particularly effective for:

- Mathematical reasoning
- Complex logic puzzles
- Large-scale coding tasks

## Usage

This GGUF model is compatible with any tool built on `llama.cpp`, such as:

- [LM Studio](https://lmstudio.ai/)
- [Ollama](https://ollama.com/)
- [Jan.ai](https://jan.ai/)

### Prompt Format

This model uses the standard Qwen chat template:

```text
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

## Credits

- **Base Model:** The Qwen Team at Alibaba Cloud.
- **RYS Technique:** David Ng (dnhkng).
- **Quantization:** Processed on a Mac Studio.
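
## Example: RYS Layer Layout

To make the RYS configuration in Model Details more concrete, the following is a minimal sketch of how the expanded layer order could be derived. It is not the script used to build this model. It rests on three assumptions that the card does not state outright: the base model has 64 layers (inferred from 80 total minus 16 added), the pair `(26, 34)` is a half-open range covering layers 26 through 33, and the two extra copies of that block are placed back to back directly after the original. The function name `rys_layer_order` is hypothetical.

```python
def rys_layer_order(n_base_layers, start, end, extra_copies):
    """Return, for each position in the expanded stack, the base-layer index used there.

    Assumption: (start, end) is half-open, i.e. it covers layers start..end-1.
    """
    if not (0 <= start < end <= n_base_layers):
        raise ValueError("invalid block range")

    block = list(range(start, end))

    # Layers up to and including the first pass through the block.
    order = list(range(0, end))

    # Extra passes through the same block (weights are copied, not retrained).
    for _ in range(extra_copies):
        order.extend(block)

    # Remaining layers after the repeated block.
    order.extend(range(end, n_base_layers))
    return order


# Assumed values: 64 base layers, block (26, 34), two extra copies.
order = rys_layer_order(n_base_layers=64, start=26, end=34, extra_copies=2)
added = len(order) - 64
print(len(order), added)
```

Under these assumptions the result has 80 positions, 16 of them added, which matches the figures listed for the 2XL variant. If the range were inclusive of layer 34, the counts would not match, which is why the half-open reading is used here.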