Model source: https://huggingface.co/AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-BF16 [a1e595e]

IMatrix source: https://huggingface.co/ReadyArt/Dark-Nexus-27B-v3.0-GGUF/blob/main/imatrix.gguf [7296c23]

Jinja chat template source: https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates/blob/main/chat_template.jinja [c31fd39]

Quantization:

llama-quantize.exe XXX-F16.gguf Q8_0

llama-quantize.exe --imatrix imatrix.gguf XXX-F16.gguf Q4_K_M
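
The two commands above can also be assembled from a script, for example when several quantization types are produced in a row. The following is a minimal Python sketch, assuming `llama-quantize.exe` is reachable on the PATH. `build_quantize_cmd` and `run_quantize` are hypothetical helpers introduced here for illustration, and only the `--imatrix` flag shown above is used. As in the commands above, no output file name is passed.

```python
import subprocess


def build_quantize_cmd(model, qtype, imatrix=None, exe="llama-quantize.exe"):
    """Return the argument list for one llama-quantize run (hypothetical helper)."""
    cmd = [exe]
    if imatrix is not None:
        # Mirrors the second command above: --imatrix <file> comes before the model.
        cmd += ["--imatrix", imatrix]
    cmd += [model, qtype]
    return cmd


def run_quantize(model, qtype, imatrix=None):
    """Run the quantizer and raise if it exits with a non-zero status."""
    subprocess.run(build_quantize_cmd(model, qtype, imatrix), check=True)


if __name__ == "__main__":
    # Only print the commands; call run_quantize(...) to actually execute them.
    print(build_quantize_cmd("XXX-F16.gguf", "Q8_0"))
    print(build_quantize_cmd("XXX-F16.gguf", "Q4_K_M", imatrix="imatrix.gguf"))
```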

PPL tests:

  • ggml-org/ci/wikitext-2-raw-v1.zip/wiki.test.raw

    # LF, UTF-8, 1.23MB
    
    llama-perplexity.exe -m xxx.gguf -f wiki.test.raw
    
    calculating perplexity over 580 chunks, n_ctx=512, batch_size=2048, n_seq=4
    
    - F16:
    PPL = 7.3717 +/- 0.05012
    
    - Q8_0:
    PPL = 7.3615 +/- 0.05000
    
    - Q4_K_M:
    PPL = 7.5632 +/- 0.05183
    
    - IMatrix-Q4_K_M:
    PPL = 7.4078 +/- 0.05028
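
To make the wikitext-2 figures above easier to compare, the short Python sketch below expresses each quantization's PPL as a relative change against F16. The values are copied from the run above; `relative_change_percent` is a helper name introduced here, not part of any tool.

```python
# PPL values copied from the wikitext-2 run above (n_ctx=512, 580 chunks).
ppl = {
    "F16": 7.3717,
    "Q8_0": 7.3615,
    "Q4_K_M": 7.5632,
    "IMatrix-Q4_K_M": 7.4078,
}


def relative_change_percent(value, baseline):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


changes = {
    name: relative_change_percent(value, ppl["F16"])
    for name, value in ppl.items()
    if name != "F16"
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

On these numbers, plain Q4_K_M is about 2.6% above F16, while the imatrix variant is about 0.5% above it. The Q8_0 value is slightly below F16, a gap smaller than the reported +/- uncertainty.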
    
  • QY789/chinese-novel-dataset/dataset.json > chinese-novel-dataset.raw

    # LF, UTF-8, 1.56MB
    
    llama-perplexity.exe --chunks -1 --ctx-size 2048 --model xxx.gguf --file chinese-novel-dataset.raw
    
    calculating perplexity over 180 chunks, n_ctx=2048, batch_size=2048, n_seq=1
    
    - Q8_0:
    Size: 27.0 GB
    PPL = 27.7540 +/- 0.21244
    
    - IMatrix-Q4_K_M:
    Size: 15.6 GB
    PPL = 27.8364 +/- 0.21260
    
    ---
    # For reference only...
    
    - ArliAI/Qwen3.5-27B-Derestricted > F16 > Q8_0
    Size: 27.0 GB
    PPL = 24.5703 +/- 0.17746
    
    - ArliAI/Qwen3.5-27B-Derestricted > F16 > IMatrix-Q4_K_M
    Size: 15.6 GB
    PPL = 24.8051 +/- 0.17916
    
    - morikomorizz/GRM-2.6-Plus-Primal > F16
    Size: 50.9 GB
    PPL = 27.9228 +/- 0.21007
    
    - morikomorizz/GRM-2.6-Plus-Primal > F16 > IMatrix-Q4_K_M
    Size: 15.6 GB
    PPL = 27.9997 +/- 0.21023
    
    - mradermacher/GRM-2.6-Plus-Primal.i1-Q4_K_M
    Size: 15.4 GB
    PPL = 28.1043 +/- 0.21144
    
    - ReadyArt/Dark-Nexus-27B-v3.0.i1-Q4_K_M_attn8_ssm8_hb16
    Size: 21.5 GB
    PPL = 23.7144 +/- 0.17150
    
    - llmfan46/Qwen3.6-27B-Uncensored-Heretic-V2-Native-MTP-Preserved-Q4_K_M
    Size: 15.6 GB
    PPL = 28.8168 +/- 0.22046
    
    - unsloth/Qwen3.6-27B-UD-Q8_K_XL
    Size: 32.8 GB
    PPL = 26.3259 +/- 0.19895
    
    - unsloth/Qwen3.6-27B-Q8_0
    Size: 26.6 GB
    PPL = 26.2658 +/- 0.19832
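
One rough way to read the list above is to compare the PPL gap between a reference file and its IMatrix-Q4_K_M quantization with the +/- value printed next to it. The Python sketch below does this for the three pairs that appear above. It is only a gauge under stated assumptions: the +/- value is taken as the uncertainty reported by llama-perplexity, and because both files are scored on the same text the two estimates are not independent, so the ratio is not a formal significance test. `gap_in_error_units` is a name introduced here.

```python
# (PPL, reported +/-) pairs copied from the chinese-novel-dataset run above.
pairs = {
    "this model (Q8_0 vs IMatrix-Q4_K_M)": ((27.7540, 0.21244), (27.8364, 0.21260)),
    "ArliAI (Q8_0 vs IMatrix-Q4_K_M)": ((24.5703, 0.17746), (24.8051, 0.17916)),
    "GRM (F16 vs IMatrix-Q4_K_M)": ((27.9228, 0.21007), (27.9997, 0.21023)),
}


def gap_in_error_units(reference, quantized):
    """PPL gap divided by the +/- value reported for the quantized file."""
    ref_ppl, _ref_err = reference
    q_ppl, q_err = quantized
    return (q_ppl - ref_ppl) / q_err


gaps = {name: gap_in_error_units(ref, quant) for name, (ref, quant) in pairs.items()}

for name, ratio in gaps.items():
    print(f"{name}: gap = {ratio:.2f} x reported error")
```

A ratio well below 1 means the quantized file cannot be told apart from its reference at the precision this test set offers.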
    
  • QY789/chinese-novel-dataset/dataset.json > chinese-novel-dataset.raw

    # CRLF, UTF-8, 1.57MB
    
    # The sampling parameters have no effect on the result
    llama-perplexity.exe --chunks -1 --ctx-size 2048 --temp 1.00 --min-p 0.00 --top-k 20 --top-p 0.95 --repeat-penalty 1.00 --presence-penalty 0.00 --model xxx.gguf --file chinese-novel-dataset.raw
    
    calculating perplexity over 180 chunks, n_ctx=2048, batch_size=2048, n_seq=1
    
    - Q8_0:
    Size: 27.0 GB
    PPL = 27.7540 +/- 0.21244
    
    - Q4_K_M:
    Size: 15.6 GB
    PPL = 29.2257 +/- 0.22777
    
    - IMatrix-Q4_K_M:
    Size: 15.6 GB
    PPL = 27.8364 +/- 0.21260
    
    - IMatrix-IQ4_NL:
    Size: 14.9 GB
    PPL = 28.2146 +/- 0.21672
    
    - IMatrix-IQ4_XS:
    Size: 14.2 GB
    PPL = 28.1907 +/- 0.21668
    
    # Script used to produce chinese-novel-dataset.raw from dataset.json
    import json
    
    with open("dataset.json", "r", encoding="utf-8") as f:
        data = json.load(f)
    
    count = 0
    with open("chinese-novel-dataset.raw", "w", encoding="utf-8") as f:
        for item in data:
            text = item.get("input", "").strip()
    
            if len(text) > 100:  # keep only novel passages with real content
                f.write(text + "\n\n\n")  # separate passages with blank lines
                count += 1
    
                if count >= 8000:  # cap the output size
                    break
    
    print(f"Conversion complete! {count} novel passages written")
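
The sizes and PPL values of the CRLF run above can be combined into a simple size/quality trade-off against Q8_0. The Python sketch below uses only numbers copied from that run; `tradeoff` is a helper name introduced here.

```python
# (name, size in GB, PPL) copied from the CRLF run above.
baseline = ("Q8_0", 27.0, 27.7540)
quants = [
    ("Q4_K_M", 15.6, 29.2257),
    ("IMatrix-Q4_K_M", 15.6, 27.8364),
    ("IMatrix-IQ4_NL", 14.9, 28.2146),
    ("IMatrix-IQ4_XS", 14.2, 28.1907),
]


def tradeoff(size_gb, ppl, base_size_gb, base_ppl):
    """Return (GB saved, PPL increase in percent) relative to the baseline."""
    saved_gb = base_size_gb - size_gb
    ppl_increase_pct = (ppl - base_ppl) / base_ppl * 100.0
    return saved_gb, ppl_increase_pct


rows = [
    (name, *tradeoff(size, ppl, baseline[1], baseline[2]))
    for name, size, ppl in quants
]

for name, saved_gb, pct in rows:
    print(f"{name:<16} saves {saved_gb:4.1f} GB, PPL {pct:+.2f}%")
```

On these numbers, the imatrix Q4_K_M file has the same size as plain Q4_K_M but stays about 0.3% above Q8_0 instead of about 5.3%.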
    
  • Roman1111111/claude-opus-4.6-10000x/opus46_final.jsonl > claude-opus-4.6-10000x_chatml.raw

    # LF, UTF-8, 11.4MB
    
    llama-perplexity.exe --chunks -1 --ctx-size 2048 --model xxx.gguf --file xxx.raw
    
    calculating perplexity over 1805 chunks, n_ctx=2048, batch_size=2048, n_seq=1
    
    - AEON-7/Qwen3.6-27B-AEON-Ultimate-Uncensored-BF16 > F16 > IMatrix-Q4_K_M
    Size: 15.6 GB
    PPL = 1.9474 +/- 0.00257
    
    - llmfan46/Qwen3.6-27B-Uncensored-Heretic-V2-Native-MTP-Preserved-Q4_K_M
    Size: 15.6 GB
    PPL = 1.9192 +/- 0.00245
    
    # Script used to convert the JSONL dataset to a .raw file (ChatML template by default)
    import argparse
    import json
    import os
    
    
    def format_chatml(messages):
        """ChatML template (Qwen, DeepSeek, Orca, etc.)"""
        formatted_text = ""
        for msg in messages:
            role = msg["role"]
            content = msg.get("content", "")
            reasoning = msg.get("reasoning", "")
    
            # If reasoning content is present, prepend it to the assistant message
            if role == "assistant" and reasoning:
                content = f"<thought>\n{reasoning}\n</thought>\n{content}"
    
            formatted_text += f"<|im_start|>{role}\n{content}<|im_end|>\n"
        return formatted_text
    
    
    def format_llama3(messages):
        """Llama 3 / 3.1 template"""
        formatted_text = "<|begin_of_text|>"
        for msg in messages:
            role = msg["role"]
            content = msg.get("content", "")
            reasoning = msg.get("reasoning", "")
    
            if role == "assistant" and reasoning:
                # For R1-distilled Llama-3, reasoning is wrapped in <thought> tags here as well
                content = f"<thought>\n{reasoning}\n</thought>\n{content}"
    
            formatted_text += (
                f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
            )
        return formatted_text
    
    
    def main():
        parser = argparse.ArgumentParser(
            description="Convert JSONL dataset to .raw file for llama-perplexity"
        )
        parser.add_argument(
            "-i",
            "--input",
            required=True,
            help="Input JSONL file path (e.g., test.jsonl)",
        )
        parser.add_argument(
            "-o",
            "--output",
            required=True,
            help="Output .raw file path (e.g., dataset.raw)",
        )
        parser.add_argument(
            "-t",
            "--template",
            choices=["chatml", "llama3"],
            default="chatml",
            help="Chat template to use",
        )
    
        args = parser.parse_args()
    
        count = 0
        with open(args.input, "r", encoding="utf-8") as infile, open(
            args.output, "w", encoding="utf-8"
        ) as outfile:
            for line in infile:
                line = line.strip()
                if not line:
                    continue
                try:
                    data = json.loads(line)
                    messages = data.get("messages", [])
    
                    if args.template == "chatml":
                        formatted_chat = format_chatml(messages)
                    elif args.template == "llama3":
                        formatted_chat = format_llama3(messages)
    
                    # Write to the file, separating samples with a newline
                    outfile.write(formatted_chat + "\n")
                    count += 1
                except Exception as e:
                    print(f"Error parsing line: {e}")
    
        print(f"Converted {count} records; saved to: {args.output}")
    
    
    if __name__ == "__main__":
        main()
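
To show what the conversion script above writes for one record, the Python sketch below repeats its `format_chatml` logic and applies it to a single JSONL line. The sample record is made up for illustration; only its shape (a `messages` list whose items carry `role`, `content` and an optional `reasoning`) is taken from the script.

```python
import json


def format_chatml(messages):
    """Same logic as format_chatml in the script above."""
    formatted_text = ""
    for msg in messages:
        role = msg["role"]
        content = msg.get("content", "")
        reasoning = msg.get("reasoning", "")
        # Reasoning, when present, is placed in front of the assistant answer.
        if role == "assistant" and reasoning:
            content = f"<thought>\n{reasoning}\n</thought>\n{content}"
        formatted_text += f"<|im_start|>{role}\n{content}<|im_end|>\n"
    return formatted_text


# Hypothetical JSONL line, not taken from the real dataset.
line = json.dumps(
    {
        "messages": [
            {"role": "user", "content": "2+2?"},
            {"role": "assistant", "reasoning": "Simple addition.", "content": "4"},
        ]
    }
)

record = json.loads(line)
rendered = format_chatml(record["messages"])
print(rendered)
```

Each record becomes one run of `<|im_start|>` / `<|im_end|>` turns, and the script appends one extra newline after it before the next record.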
    