Qwen3.5 2B Korean Multi-session Memory Extraction LoRA

This repository contains a PEFT LoRA adapter trained for turn-level Korean memory extraction.

The model decides whether the current user message contains information worth storing as long-term memory. If it does, the model returns a structured JSON object describing each memory. If not, it returns should_remember: false with an empty memories array.

Base Model

  • Base model: unsloth/Qwen3.5-2B
  • Adapter type: LoRA
  • PEFT task type: CAUSAL_LM
  • LoRA rank: 16
  • LoRA alpha: 16
  • Source checkpoint: local checkpoint-400
  • Validation snapshot: eval_loss = 0.06641462445259094 on a capped validation run
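As a rough illustration of how the rank and alpha values above interact: in PEFT's default LoRA formulation the low-rank update is multiplied by lora_alpha / r, so with both set to 16 the update is applied at scale 1.0. The sketch below assumes the adapter does not enable rank-stabilized scaling (use_rslora), which this card does not state. The config dict is a hand-written stand-in, not the actual contents of adapter_config.json.

```python
import math

# Hand-written stand-in for the relevant adapter_config.json fields.
# r and lora_alpha come from the list above; use_rslora=False is an assumption.
config = {"r": 16, "lora_alpha": 16, "use_rslora": False}


def lora_scaling(r: int, lora_alpha: int, use_rslora: bool = False) -> float:
    """Return the factor applied to the low-rank update B @ A."""
    if use_rslora:
        # Rank-stabilized LoRA divides by sqrt(r) instead of r.
        return lora_alpha / math.sqrt(r)
    return lora_alpha / r


print(lora_scaling(config["r"], config["lora_alpha"], config["use_rslora"]))
```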

Training Data

This adapter was trained on a processed turn-level memory extraction dataset derived from AI Hub's 한국어 멀티세션 대화 (Korean multi-session dialogue) dataset.

The original AI Hub dataset was converted into supervised turn-level memory extraction samples. This repository does not contain the original AI Hub dataset, raw conversations, or the processed training JSONL shards.

Recommended attribution:

This model was trained using data derived from AI Hub's "한국어 멀티세션 대화" dataset. The original dataset is available from AI Hub: https://aihub.or.kr/aihubdata/data/view.do?aihubDataSe=data&dataSetSn=71630

License and Data Use

The LoRA adapter is a derived training artifact. It is distributed separately from the original AI Hub dataset and does not redistribute the source dataset.

Use of the underlying AI Hub dataset is subject to AI Hub's data usage policy:

  • AI Hub states that its open AI training data was built as part of the Ministry of Science and ICT / National Information Society Agency data infrastructure program, and rights are held by the participating organizations and NIA.
  • AI Hub open data may be used for commercial and non-commercial research and development for AI technology, products, and services, subject to the AI Hub policy.
  • Users must disclose that the data is an NIA project result when using AI Hub data or derivative works based on it.
  • Overseas use or export of AI Hub data may require a separate agreement with the participating organizations and NIA.
  • The original AI Hub data must not be disclosed, provided, transferred, rented, or sold to third parties without approval.
  • The original AI Hub data is for AI model training use; commercial sale of the dataset itself requires separate consultation with the performing organization.

See AI Hub's official usage policy for the authoritative terms: https://www.aihub.or.kr/intrcn/guid/usagepolicy.do?currMenu=151&topMenu=105

Files

  • adapter_config.json: PEFT adapter config
  • adapter_model.safetensors: LoRA adapter weights
  • gguf/qwen35-2b-korean-multisession-memory-extract-lora-checkpoint-400.gguf: llama.cpp-compatible LoRA GGUF adapter

Optimizer, scheduler, RNG state, and other training-resume files are intentionally not included.

Task Format

Input (the assistant asks "Do you have a favorite food?" and the user replies "I like tteokbokki"):

{
  "recent_context": [
    {
      "role": "assistant",
      "content": "좋아하는 음식이 있으세요?"
    }
  ],
  "current_user_message": "난 떡볶이를 좋아해"
}
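A minimal sketch of how an application might assemble this input object from its own chat history. The helper name build_sample and the max_context_turns cap are hypothetical: the card does not say how many turns of context were used during training, so treat the cap as a tunable assumption.

```python
import json


def build_sample(history, current_user_message, max_context_turns=4):
    """Build the input object from a chat history.

    history is a list of {"role": ..., "content": ...} dicts, oldest first.
    max_context_turns is a hypothetical cap, not a documented training setting.
    """
    # Guard against history[-0:], which would return the whole list.
    recent = history[-max_context_turns:] if max_context_turns > 0 else []
    return {
        "recent_context": [
            {"role": turn["role"], "content": turn["content"]} for turn in recent
        ],
        "current_user_message": current_user_message,
    }


history = [{"role": "assistant", "content": "좋아하는 음식이 있으세요?"}]
sample = build_sample(history, "난 떡볶이를 좋아해")
# ensure_ascii=False keeps the Korean text readable instead of \u escapes.
print(json.dumps(sample, ensure_ascii=False, indent=2))
```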

Expected output style (the memory content reads "The user likes tteokbokki."):

{
  "should_remember": true,
  "memories": [
    {
      "type": "preference",
      "content": "사용자는 떡볶이를 좋아한다.",
      "evidence": "난 떡볶이를 좋아해",
      "confidence": "high",
      "sensitivity": "normal",
      "scope": "long_term"
    }
  ]
}

Negative example:

{
  "should_remember": false,
  "memories": []
}
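Because the output is meant to be consumed by code, it can help to check its shape before storing anything. The sketch below is one possible validator, not part of this repository. The field names come from the example output above; the card does not list the allowed values for each field, so the check only verifies presence and string type.

```python
import json

# Field names taken from the example output above.
MEMORY_FIELDS = ("type", "content", "evidence", "confidence", "sensitivity", "scope")


def validate_output(obj):
    """Return a list of problems found in a parsed model output (empty if none)."""
    if not isinstance(obj, dict):
        return ["output is not a JSON object"]
    problems = []
    if not isinstance(obj.get("should_remember"), bool):
        problems.append("should_remember must be true or false")
    memories = obj.get("memories")
    if not isinstance(memories, list):
        problems.append("memories must be an array")
        return problems
    if obj.get("should_remember") is False and memories:
        problems.append("memories must be empty when should_remember is false")
    for i, memory in enumerate(memories):
        if not isinstance(memory, dict):
            problems.append(f"memories[{i}] is not an object")
            continue
        for field in MEMORY_FIELDS:
            if not isinstance(memory.get(field), str):
                problems.append(f"memories[{i}].{field} is missing or not a string")
    return problems


negative = json.loads('{"should_remember": false, "memories": []}')
print(validate_output(negative))
```

Returning a list of problems rather than raising lets the caller decide whether to drop the turn, retry generation, or log the malformed output.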

Python Usage

import json
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = "unsloth/Qwen3.5-2B"
adapter_id = "mangoo3431/aura-qwen35-2b-korean-multisession-memory-extract-lora"

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)

instruction = (
    "๋‹ค์Œ ์‚ฌ์šฉ์ž์˜ ํ˜„์žฌ ๋ฐœํ™”์™€ ์ด์ „ ๋ฌธ๋งฅ์„ ๋ณด๊ณ , ์žฅ๊ธฐ์ ์œผ๋กœ ๊ธฐ์–ตํ•  ์ •๋ณด๊ฐ€ ์žˆ๋Š”์ง€ ํŒ๋‹จํ•˜์„ธ์š”.\n"
    "๊ธฐ์–ตํ•  ์ •๋ณด๊ฐ€ ์žˆ์œผ๋ฉด JSON์œผ๋กœ ์ถ”์ถœํ•˜์„ธ์š”.\n"
    "์—†์œผ๋ฉด should_remember๋ฅผ false๋กœ ๋‘๊ณ  memories๋Š” ๋นˆ ๋ฐฐ์—ด๋กœ ๋‘์„ธ์š”."
)

sample = {
    "recent_context": [],
    "current_user_message": "์•ˆ๋…• ๋‚ด ์ด๋ฆ„์€ ๊น€์ฒ ์ˆ˜์•ผ",
}

prompt = f"{instruction}\n\n์ž…๋ ฅ:\n{json.dumps(sample, ensure_ascii=False)}\n\n์ •๋‹ต:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

GGUF Usage

The GGUF file is a LoRA adapter, not a standalone model. Use it with a compatible Qwen3.5 2B base GGUF in llama.cpp:

llama-cli -m /path/to/qwen35-2b-base.gguf \
  --lora gguf/qwen35-2b-korean-multisession-memory-extract-lora-checkpoint-400.gguf \
  -p "<prompt>"
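The real prompt is multi-line Korean text with embedded JSON, which is awkward to quote by hand in a shell. One way around that, sketched here under assumptions, is to build the argument list in Python and pass it to subprocess without a shell. The helper name llama_cli_args is hypothetical, and the -n value of 256 simply mirrors max_new_tokens from the Python example.

```python
import shlex


def llama_cli_args(base_gguf, lora_gguf, prompt, max_new_tokens=256):
    """Return the argv list for the llama-cli call shown above.

    -n caps the number of generated tokens; 256 is an assumed default.
    """
    return [
        "llama-cli",
        "-m", base_gguf,
        "--lora", lora_gguf,
        "-n", str(max_new_tokens),
        "-p", prompt,
    ]


args = llama_cli_args(
    "/path/to/qwen35-2b-base.gguf",
    "gguf/qwen35-2b-korean-multisession-memory-extract-lora-checkpoint-400.gguf",
    "<prompt>",
)
# Print a shell-quoted form for inspection.
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```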

Notes

This adapter was trained for a narrow extraction task, not for general chat. For best results, keep the inference prompt close to the training format and parse only the generated JSON object.
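One possible way to "parse only the generated JSON object" is to scan the generated text for the first position where a JSON object decodes cleanly and ignore everything after it. The helper below is a sketch using only the standard library; the function name and the sample string are illustrative, not taken from real model output.

```python
import json


def extract_first_json_object(text):
    """Return the first decodable JSON object in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


# Hypothetical generation with trailing text after the JSON object.
generated = '{"should_remember": false, "memories": []}\n\n입력:'
print(extract_first_json_object(generated))
```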
