CoNDeNse-AI/GLM-5.1-Qwen3-1.7B-CoNDeNse

Part of the CoNDeNse project, which compresses the reasoning capability of large models into small, deployable ones.

Model Details

  • Base model: Qwen/Qwen3-1.7B
  • Method: LoRA (r=32, α=64)
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Dtype: float16
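
The adapter settings listed above can be written down as a small configuration sketch. This is an illustration rather than the project's actual training script: the key names mirror the `peft.LoraConfig` arguments (`r`, `lora_alpha`, `target_modules`), and `task_type` is an assumption that this card does not state. The scaling factor follows the standard LoRA convention of α / r.

```python
# Hyperparameters copied from the Model Details list.
# task_type is an assumption; the card does not state it.
lora_kwargs = {
    "r": 32,
    "lora_alpha": 64,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumed
}

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(f"LoRA scaling factor: {lora_scaling}")
```

With `peft` installed, these values could be passed as `LoraConfig(**lora_kwargs)`.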

Training

  • Dataset: Jackrong/GLM-5.1-Reasoning-1M-Cleaned (75,000 examples)
  • Optimizer: AdamW 8-bit
  • Learning rate: 2e-4 with cosine scheduler
  • Batch size: 1, with 16 gradient accumulation steps (effective batch size = 16)
  • Max sequence length: 4096
  • Packing: enabled
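
The batch and schedule settings above imply some simple arithmetic, sketched below. Several things here are assumptions not given in this card: a single training device, no warmup phase, decay to zero, and a hypothetical `TOTAL_STEPS` value used only to sample the curve. The `cosine_lr` helper is illustrative and is not taken from the training code.

```python
import math

PEAK_LR = 2e-4         # from the card
MICRO_BATCH = 1        # from the card
GRAD_ACCUM_STEPS = 16  # from the card
MAX_SEQ_LEN = 4096     # from the card

# Sequences contributing to one optimizer step (single device assumed).
effective_batch = MICRO_BATCH * GRAD_ACCUM_STEPS

# With packing, each sequence holds at most MAX_SEQ_LEN tokens,
# so this is an upper bound on tokens per optimizer step.
max_tokens_per_step = effective_batch * MAX_SEQ_LEN


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr to 0 (no warmup; an assumption)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 1000  # hypothetical; the card does not give a step count
print("effective batch:", effective_batch)
print("max tokens per step:", max_tokens_per_step)
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{cosine_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the learning rate starts at the peak value, reaches half of it at the midpoint, and ends at zero.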

Notes

The model may hallucinate; verify important outputs independently.

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in float16 on the GPU.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-1.7B",
    torch_dtype=torch.float16,
    device_map="cuda",
)

# Load the tokenizer and attach the LoRA adapter to the base model.
tokenizer = AutoTokenizer.from_pretrained("CoNDeNse-AI/GLM-5.1-Qwen3-1.7B-CoNDeNse")
model = PeftModel.from_pretrained(base_model, "CoNDeNse-AI/GLM-5.1-Qwen3-1.7B-CoNDeNse")

# ChatML-style prompt: one user turn followed by an open assistant turn.
prompt = "<|im_start|>user\nYour question here<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 512 new tokens without tracking gradients.
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# The decoded text contains the prompt followed by the generated response.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
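
The usage example above writes the ChatML prompt by hand. For multi-turn conversations, the same format can be assembled with a small helper, sketched below. `build_chatml_prompt` is a hypothetical name introduced here for illustration; it is not part of `transformers` or of this repository. In practice the tokenizer's own `apply_chat_template` method is usually the safer route, since it follows the template shipped with the model.

```python
def build_chatml_prompt(messages: list[dict[str, str]]) -> str:
    """Format role/content messages as ChatML and open an assistant turn.

    Hypothetical helper for illustration; it reproduces the hand-written
    prompt format from the usage example.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt([{"role": "user", "content": "Your question here"}])
print(repr(prompt))
```

For a single user message this yields the same string as the hand-written `prompt` in the usage example.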