Qwen-lora-merged-4bit-model

Model description

The Qwen LLM can be fine-tuned with LoRA adapters: the original model weights are frozen, and only a small number of new low-rank matrices are trained. The large base model is left untouched, while the small adapter weights learn the information needed for a new, specific task. Because far fewer parameters are updated, this is much cheaper in compute and memory than full fine-tuning. After fine-tuning, the Qwen base model and the LoRA adapters can be merged into a single model, which can then be run with vLLM or Ollama.
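The merge step described above can be illustrated with a toy example. This is a minimal pure-Python sketch, not the procedure used to build this model: the matrices, the rank `r`, and the scaling factor `alpha` are hypothetical values, and the `alpha / r` scaling is the common LoRA convention, assumed here rather than taken from this model's configuration. It shows that folding the low-rank update into the frozen weight gives the same output as running the base path and the adapter path separately.

```python
# Toy LoRA merge in pure Python (all sizes and values are hypothetical).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b, element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Frozen base weight W with shape (d_out x d_in) = (2 x 3).
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]

# Trainable low-rank factors with rank r = 1:
# B has shape (d_out x r), A has shape (r x d_in).
B = [[0.5], [-1.0]]
A = [[1.0, 2.0, 0.0]]
alpha, r = 2.0, 1          # assumed LoRA hyperparameters
scale = alpha / r

x = [[1.0], [2.0], [3.0]]  # input as a column vector

# During fine-tuning: frozen base path plus scaled adapter path.
y_adapter = add(matmul(W, x), matmul(B, matmul(A, x)), scale)

# After merging: fold the low-rank update into a single weight matrix.
W_merged = add(W, matmul(B, A), scale)
y_merged = matmul(W_merged, x)

print(y_adapter == y_merged)
```

Once merged, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint can be loaded like any ordinary single model.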

Download model

Download the model files from the Files & versions tab.

Safetensors

Model size: 8B params

Tensor type: F32, BF16