---
base_model: Qwen/Qwen3-1.7B
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:Qwen/Qwen3-1.7B
- lora
- sft
- transformers
- trl
- unsloth
---

# CoNDeNse-AI/GLM-5.1-Qwen3-1.7B-CoNDeNse

Part of the **CoNDeNse** project, which compresses the reasoning capability of large models into small, deployable ones.

## Model Details

- **Base model:** Qwen/Qwen3-1.7B
- **Method:** LoRA (r=32, α=64)
- **Target modules:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Dtype:** float16

## Training

- **Dataset:** Jackrong/GLM-5.1-Reasoning-1M-Cleaned (75,000 examples)
- **Optimizer:** AdamW 8-bit
- **Learning rate:** 2e-4 with cosine scheduler
- **Batch size:** 1 × 16 gradient accumulation (effective batch = 16)
- **Max sequence length:** 4096
- **Packing:** enabled

## Notes

The model may **HALLUCINATE**, so verify its outputs independently.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model in float16, matching the adapter's dtype.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-1.7B",
    torch_dtype=torch.float16,
    device_map="cuda",
)
tokenizer = AutoTokenizer.from_pretrained("CoNDeNse-AI/GLM-5.1-Qwen3-1.7B-CoNDeNse")

# Attach the LoRA adapter weights to the base model.
model = PeftModel.from_pretrained(base_model, "CoNDeNse-AI/GLM-5.1-Qwen3-1.7B-CoNDeNse")

# ChatML-style prompt: one user turn followed by an open assistant turn.
prompt = "<|im_start|>user\nYour question here<|im_end|>\n<|im_start|>assistant\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
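
The usage snippet hard-codes a single-turn prompt string. For multi-turn conversations, a small helper can assemble the same `<|im_start|>` / `<|im_end|>` layout from a list of messages. The sketch below is illustrative only: `build_chatml_prompt` is a hypothetical helper name, not part of this repository or of any library, and it assumes the prompt format shown in the usage snippet is the one the adapter expects.

```python
from typing import Dict, List


def build_chatml_prompt(messages: List[Dict[str, str]]) -> str:
    """Assemble a ChatML-style prompt (hypothetical helper, not a library API).

    Each message is a dict with "role" and "content" keys. The returned string
    ends with an open assistant turn so the model continues from there.
    """
    parts = []
    for message in messages:
        # One closed turn per message: <|im_start|>role\ncontent<|im_end|>\n
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # Leave the assistant turn open for generation.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Reproduces the single-turn prompt from the usage snippet.
prompt = build_chatml_prompt([{"role": "user", "content": "Your question here"}])
print(prompt)
```

The resulting `prompt` can be passed to `tokenizer(...)` exactly as in the usage snippet. `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from `transformers` is an alternative, though the tokenizer's bundled template may produce a slightly different string than the hand-written one.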