Qwen3.5 2B Korean Multi-session Memory Extraction LoRA
This repository contains a PEFT LoRA adapter trained for turn-level Korean memory extraction.
The model is intended to decide whether the current user message contains information worth storing as long-term memory. If there is a memory, it returns structured JSON. If not, it returns should_remember: false with an empty memories array.
Base Model
- Base model: unsloth/Qwen3.5-2B
- Adapter type: LoRA
- PEFT task type: CAUSAL_LM
- LoRA rank: 16
- LoRA alpha: 16
- Source checkpoint: local checkpoint-400
- Validation snapshot: eval_loss = 0.06641462445259094 on a capped validation run
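For context on what rank 16 and alpha 16 imply: in the standard LoRA formulation the low-rank update is multiplied by alpha / rank, so these settings would give a scale of 1.0. The sketch below only does that arithmetic. It assumes the default (non rank-stabilized) scaling, which this card does not state explicitly, and the function name is illustrative.

```python
def lora_scaling(lora_alpha: float, rank: int) -> float:
    """Scale applied to the low-rank update in standard LoRA (alpha / rank)."""
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    return lora_alpha / rank


# Values taken from the list above; default scaling is an assumption.
scale = lora_scaling(lora_alpha=16, rank=16)
print(scale)
```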
Training Data
This adapter was trained using a processed turn-level memory extraction dataset derived from AI Hub's 한국어 멀티세션 대화 (Korean Multi-session Dialogue) dataset.
- Dataset name: 한국어 멀티세션 대화 (Korean Multi-session Dialogue)
- Provider: AI Hub
- Dataset page: https://aihub.or.kr/aihubdata/data/view.do?aihubDataSe=data&dataSetSn=71630
- Domain/type: Korean text, multi-session dialogue
- Original purpose: multi-session Korean dialogue data for chatbot research and development with long-term conversational memory
The original AI Hub dataset was converted into supervised turn-level memory extraction samples. This repository does not contain the original AI Hub dataset, raw conversations, or the processed training JSONL shards.
Recommended attribution:
This model was trained using data derived from AI Hub's "한국어 멀티세션 대화" (Korean Multi-session Dialogue) dataset. The original dataset is available from AI Hub: https://aihub.or.kr/aihubdata/data/view.do?aihubDataSe=data&dataSetSn=71630
License and Data Use
The LoRA adapter is a derived training artifact. It is distributed separately from the original AI Hub dataset and does not redistribute the source dataset.
Use of the underlying AI Hub dataset is subject to AI Hub's data usage policy:
- AI Hub states that its open AI training data was built as part of the Ministry of Science and ICT / National Information Society Agency data infrastructure program, and rights are held by the participating organizations and NIA.
- AI Hub open data may be used for commercial and non-commercial research and development for AI technology, products, and services, subject to the AI Hub policy.
- Users must disclose that the data is an NIA project result when using AI Hub data or derivative works based on it.
- Overseas use or export of AI Hub data may require a separate agreement with the participating organizations and NIA.
- The original AI Hub data must not be disclosed, provided, transferred, rented, or sold to third parties without approval.
- The original AI Hub data is for AI model training use; commercial sale of the dataset itself requires separate consultation with the performing organization.
See AI Hub's official usage policy for the authoritative terms: https://www.aihub.or.kr/intrcn/guid/usagepolicy.do?currMenu=151&topMenu=105
Files
- adapter_config.json: PEFT adapter config
- adapter_model.safetensors: LoRA adapter weights
- gguf/qwen35-2b-korean-multisession-memory-extract-lora-checkpoint-400.gguf: llama.cpp-compatible LoRA GGUF adapter
Optimizer, scheduler, RNG state, and other training-resume files are intentionally not included.
Task Format
Input:
{
  "recent_context": [
    {
      "role": "assistant",
      "content": "좋아하는 음식이 있으세요?"
    }
  ],
  "current_user_message": "난 떡볶이를 좋아해"
}
Expected output style:
{
  "should_remember": true,
  "memories": [
    {
      "type": "preference",
      "content": "사용자는 떡볶이를 좋아한다.",
      "evidence": "난 떡볶이를 좋아해",
      "confidence": "high",
      "sensitivity": "normal",
      "scope": "long_term"
    }
  ]
}
Negative example:
{
  "should_remember": false,
  "memories": []
}
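Downstream code will usually want to check that a generated string actually has this shape before storing anything. The following is a minimal sketch of such a check. The set of required memory keys is inferred from the single positive example above, not from a published schema, and the names `validate_extraction` and `REQUIRED_MEMORY_KEYS` are hypothetical.

```python
import json
from typing import Any

# Assumption: these keys are taken from the one positive example on this card.
REQUIRED_MEMORY_KEYS = {
    "type", "content", "evidence", "confidence", "sensitivity", "scope",
}


def validate_extraction(raw: str) -> dict[str, Any]:
    """Parse model output and check it against the shape shown above."""
    result = json.loads(raw)
    if not isinstance(result, dict):
        raise ValueError("top-level value must be a JSON object")
    if not isinstance(result.get("should_remember"), bool):
        raise ValueError("should_remember must be a boolean")
    memories = result.get("memories")
    if not isinstance(memories, list):
        raise ValueError("memories must be a list")
    if not result["should_remember"] and memories:
        raise ValueError("memories must be empty when should_remember is false")
    for memory in memories:
        if not isinstance(memory, dict):
            raise ValueError("each memory must be a JSON object")
        missing = REQUIRED_MEMORY_KEYS - set(memory)
        if missing:
            raise ValueError(f"memory is missing keys: {sorted(missing)}")
    return result


negative = validate_extraction('{"should_remember": false, "memories": []}')
print(negative["memories"])
```

Raising on a malformed result, rather than silently dropping it, makes it easier to notice when the prompt has drifted from the training format.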
Python Usage
import json
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base_model = "unsloth/Qwen3.5-2B"
adapter_id = "mangoo3431/aura-qwen35-2b-korean-multisession-memory-extract-lora"
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
base_model,
device_map="auto",
torch_dtype="auto",
trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, adapter_id)
instruction = (
"๋ค์ ์ฌ์ฉ์์ ํ์ฌ ๋ฐํ์ ์ด์ ๋ฌธ๋งฅ์ ๋ณด๊ณ , ์ฅ๊ธฐ์ ์ผ๋ก ๊ธฐ์ตํ ์ ๋ณด๊ฐ ์๋์ง ํ๋จํ์ธ์.\n"
"๊ธฐ์ตํ ์ ๋ณด๊ฐ ์์ผ๋ฉด JSON์ผ๋ก ์ถ์ถํ์ธ์.\n"
"์์ผ๋ฉด should_remember๋ฅผ false๋ก ๋๊ณ memories๋ ๋น ๋ฐฐ์ด๋ก ๋์ธ์."
)
sample = {
"recent_context": [],
"current_user_message": "์๋
๋ด ์ด๋ฆ์ ๊น์ฒ ์์ผ",
}
prompt = f"{instruction}\n\n์
๋ ฅ:\n{json.dumps(sample, ensure_ascii=False)}\n\n์ ๋ต:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=256,
do_sample=False,
)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
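Because the adapter works at the turn level, a full conversation has to be split into one input per user message. A minimal sketch of that loop follows. The `window` of 4 prior messages, the inclusion of earlier user messages in `recent_context`, and the helper name `iter_turn_samples` are all assumptions for illustration; the card does not say how much context was used during training.

```python
from typing import Iterator


def iter_turn_samples(dialogue: list[dict], window: int = 4) -> Iterator[dict]:
    """Yield one task-format input per user message in a conversation.

    `dialogue` is an ordered list of {"role": ..., "content": ...} messages.
    """
    for index, message in enumerate(dialogue):
        if message["role"] != "user":
            continue
        # Keep at most `window` messages that precede the current one.
        start = max(0, index - window)
        yield {
            "recent_context": dialogue[start:index],
            "current_user_message": message["content"],
        }


# Illustrative conversation, not taken from the training data.
dialogue = [
    {"role": "user", "content": "안녕 내 이름은 김철수야"},
    {"role": "assistant", "content": "좋아하는 음식이 있으세요?"},
    {"role": "user", "content": "난 떡볶이를 좋아해"},
]
for sample in iter_turn_samples(dialogue):
    print(len(sample["recent_context"]), sample["current_user_message"])
```

Each yielded dictionary can be passed to `json.dumps(..., ensure_ascii=False)` and placed in the prompt in the same way as `sample` above.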
GGUF Usage
The GGUF file is a LoRA adapter, not a standalone model. Use it with a compatible Qwen3.5 2B base GGUF in llama.cpp:
llama-cli -m /path/to/qwen35-2b-base.gguf \
--lora gguf/qwen35-2b-korean-multisession-memory-extract-lora-checkpoint-400.gguf \
-p "<prompt>"
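When llama.cpp is driven from a script, it can be convenient to assemble the same command as an argument list. The sketch below only builds and prints the command; it does not run it. The helper name and the base-model path are placeholders, and the `-n 256` token limit is an assumption that mirrors `max_new_tokens` in the Python example rather than a setting from this card.

```python
import shlex


def build_llama_cli_command(
    base_gguf: str, lora_gguf: str, prompt: str, max_tokens: int = 256
) -> list[str]:
    """Build a llama-cli argument list that applies a LoRA GGUF to a base GGUF."""
    return [
        "llama-cli",
        "-m", base_gguf,
        "--lora", lora_gguf,
        "-p", prompt,
        "-n", str(max_tokens),
    ]


command = build_llama_cli_command(
    "/path/to/qwen35-2b-base.gguf",  # placeholder path, as in the command above
    "gguf/qwen35-2b-korean-multisession-memory-extract-lora-checkpoint-400.gguf",
    "<prompt>",
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, capture_output=True, text=True)`, which avoids shell-quoting problems with prompts that contain quotes or newlines.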
Notes
This adapter was trained for a narrow extraction task, not for general chat. For best results, keep the inference prompt close to the training format and parse only the generated JSON object.
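One way to parse only the generated JSON object is to scan for the first position where a complete object decodes and ignore everything after it. This is a sketch under the assumption that the model may emit stray text before or after the object; `extract_first_json_object` is a hypothetical helper, not part of this repository.

```python
import json


def extract_first_json_object(text: str) -> dict:
    """Return the first complete JSON object found in `text`."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores the rest.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # Not a valid object here; try the next opening brace.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


generated = 'noise {"should_remember": false, "memories": []} trailing text'
print(extract_first_json_object(generated))
```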