amigo-lora

A LoRA adapter that gives Qwen3.5-2B the voice of a warm, patient companion for an older adult. It speaks in short, kind sentences, takes interest in the person's day, and keeps them company. Built for amigo, a local and private voice companion, during the Hugging Face Build Small Hackathon.

What it teaches

This adapter shapes how the model talks: warm, brief, in a Peruvian Spanish register or plain English. It carries no facts about the person. Their name, family, health, and routine live in the running app's profile and memory, never in the weights, so that information stays private and stays current.

Why a LoRA on a 2B

The point is to hold the companion voice on a small model that runs fast on a laptop CPU, the kind of hardware a phone has. The difference is easy to hear. Ask the plain 2B "me siento un poco solo hoy" and it answers cheerfully but misses the feeling. With this adapter it acknowledges the loneliness and offers company before asking how the person is.

Training

  • Base: Qwen/Qwen3.5-2B
  • Method: QLoRA (4-bit), rank 16, alpha 32, dropout 0.05
  • Target modules: q_proj k_proj v_proj o_proj gate_proj up_proj down_proj
  • Schedule: 3 epochs, learning rate 2e-4
  • Data: 366 curated dialogue pairs (186 Spanish, 180 English), each wrapped with the app's exact system persona, plus a held-out set of 45. The pairs carry the voice, never personal facts.

It was chosen from a four-variant grid that varied capacity, epochs, and language scope. Bilingual training kept the Spanish voice intact, five epochs overfit (verbatim recall), and a smaller attention-only adapter underfit.

Usage

PEFT (transformers):

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "Qwen/Qwen3.5-2B"
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)
model = PeftModel.from_pretrained(model, "pebeto/amigo-lora")

llama.cpp (CPU): apply the bundled GGUF adapter on top of a Qwen3.5-2B GGUF.

llama-cli -m Qwen3.5-2B-Q4_K_M.gguf --lora amigo-lora-Q8_0.gguf \
  -p "Eres un companero amable y paciente..."

Give it the system persona it trained on (warm companion, one to three short sentences, no lists). The voice depends on that prompt being present.

Files

File Purpose
adapter_model.safetensors, adapter_config.json the PEFT adapter
amigo-lora-Q8_0.gguf the same adapter for llama.cpp
chat_template.jinja, tokenizer* the chat format

Limitations

  • Small model. Factual reliability is limited. Pair it with retrieval for anything current, and read its claims with care.
  • No built-in memory. It knows nothing about a person unless the prompt provides it.
  • Language focus. Spanish is the primary target, in a Peruvian register. English is plainer and lighter.

License

Apache-2.0, following the base model Qwen/Qwen3.5-2B.

Downloads last month
53
GGUF
Model size
10.9M params
Architecture
qwen35
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for pebeto/amigo-lora

Finetuned
Qwen/Qwen3.5-2B
Adapter
(95)
this model

Dataset used to train pebeto/amigo-lora

Space using pebeto/amigo-lora 1

Collection including pebeto/amigo-lora