NanoLLM Qwen2.5-7B-Instruct v3.1

Compact self-contained NanoLLM format is in nano_compact/.

from transformers import AutoModelForCausalLM, AutoTokenizer
repo_id = "RthItalia/NanoLLM-Qwen2.5-7B-v3.1"
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="nano_compact", use_fast=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, subfolder="nano_compact", trust_remote_code=True, device_map="auto")

Validation against 8-bit reference:

  • avg cosine: 0.990625
  • min cosine: 0.98046875
  • gate: avg >= 0.985

nano_compact/model.safetensors contains Nano quantized tensors and does not require downloading the Qwen base weights.

Downloads last month
70
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for RthItalia/NanoLLM-Qwen2.5-7B-v3.1

Base model

Qwen/Qwen2.5-7B
Finetuned
(2617)
this model