Qwen-lora-merged-4bit-model

Model description

The Qwen LLM can be fine-tuned with LoRA adapters: the original model weights are frozen, and only a small number of new low-rank matrices are trained to capture the information for a specific task. Because the large base model is left untouched, this is much more efficient and less computationally expensive than full fine-tuning. After fine-tuning, the Qwen model and the LoRA adapters can be merged into a single model, which can be run using vLLM or Ollama.
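
To make the merge step concrete, here is a minimal sketch of the arithmetic behind it, assuming the standard LoRA formulation W' = W + (alpha / r) * B A. It uses tiny hand-written matrices in pure Python; the helper names (`matmul`, `matvec`, `merge_lora`), the shapes, and the values are illustrative assumptions, not the code or settings used to build this model.

```python
# Sketch of merging a LoRA adapter into a frozen weight matrix.
# All names, shapes, and values here are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(w, x):
    """Compute W @ x for a matrix W and a vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def merge_lora(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank."""
    r = len(lora_a)                  # A has shape (r, in_features)
    scale = alpha / r
    delta = matmul(lora_b, lora_a)   # shape (out_features, in_features)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen base weight (2 x 3) and a rank-1 adapter.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
A = [[0.5, -1.0, 0.0]]   # shape (r=1, in=3)
B = [[2.0], [4.0]]       # shape (out=2, r=1)
ALPHA = 2.0

W_merged = merge_lora(W, A, B, ALPHA)

# The merged matrix should give the same output as running the base
# weight and the scaled adapter side by side.
x = [1.0, 2.0, 3.0]
scale = ALPHA / len(A)
unmerged = [b + scale * a
            for b, a in zip(matvec(W, x), matvec(B, matvec(A, x)))]
merged = matvec(W_merged, x)
print(merged, unmerged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the result can be served as a single model.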

Download model

Download the model files from the Files & versions tab.

Format: Safetensors
Model size: 8B params
Tensor types: F32, BF16, U8