Translation
PEFT
Safetensors
Korean
English
korean
english
ko-en
en-ko
kanana
kanana-1.5
lora
unsloth
Eval Results (legacy)
Instructions to use harveykim/kanana-1.5-2.1b-aihub-ko-en-lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use harveykim/kanana-1.5-2.1b-aihub-ko-en-lora with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("kakaocorp/kanana-1.5-2.1b-base") model = PeftModel.from_pretrained(base_model, "harveykim/kanana-1.5-2.1b-aihub-ko-en-lora") - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- Unsloth Studio
How to use harveykim/kanana-1.5-2.1b-aihub-ko-en-lora with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for harveykim/kanana-1.5-2.1b-aihub-ko-en-lora to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for harveykim/kanana-1.5-2.1b-aihub-ko-en-lora to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for harveykim/kanana-1.5-2.1b-aihub-ko-en-lora to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="harveykim/kanana-1.5-2.1b-aihub-ko-en-lora", max_seq_length=2048, )
| license: cc-by-nc-4.0 | |
| tags: | |
| - translation | |
| - korean | |
| - english | |
| - ko-en | |
| - en-ko | |
| - kanana | |
| - kanana-1.5 | |
| - lora | |
| - peft | |
| - unsloth | |
| language: | |
| - ko | |
| - en | |
| library_name: peft | |
| pipeline_tag: translation | |
| base_model: kakaocorp/kanana-1.5-2.1b-base | |
| datasets: | |
| - AI-Hub-126-korean-english-parallel | |
| model-index: | |
| - name: kanana-1.5-2.1b-aihub-ko-en-lora | |
| results: | |
| - task: | |
| type: translation | |
| dataset: | |
| name: AI Hub #126 โ ํ๊ตญ์ด-์์ด ๋ณ๋ ฌ ๋ง๋ญ์น (๊ท ํ ๊ฒ์ฆ์ 200์) | |
| type: AI-Hub-126 | |
| metrics: | |
| - type: bleu | |
| value: 32.04 | |
| name: BLEU | |
| - type: chrf | |
| value: 57.26 | |
| name: chrF++ | |
| # kanana-1.5-2.1b-aihub-ko-en-lora | |
| > **Kanana 1.5 2.1B base** (์นด์นด์ค)๋ฅผ **AI Hub ํ-์ ๋ณ๋ ฌ ๋ง๋ญ์น (#126)** ๋ก QLoRA ํ์ธํ๋ํ **ํโ์ ๋ฒ์ญ ์ด๋ํฐ**์ ๋๋ค. | |
| ์ฝ๋ยท๋ฌธ์ ์ ์ฒด ๋ฆฌํฌ์งํ ๋ฆฌ: [github.com/DeepMav/on-device-translation](https://github.com/DeepMav/on-device-translation) | |
| ์๋งค ๋ชจ๋ธ: [`harveykim/gemma-3-1b-aihub-ko-en-lora`](https://huggingface.co/harveykim/gemma-3-1b-aihub-ko-en-lora) โ ๋์ผ ๋ฐ์ดํฐยท๋์ผ ํ๊ฐ ํ๋กํ ์ฝ๋ก ํ์ต๋ Gemma 3 1B ๋ฒ์ . | |
| --- | |
| ## ๋ชจ๋ธ ์ ๋ณด | |
| | ํญ๋ชฉ | ๊ฐ | | |
| |---|---| | |
| | ๋ฒ ์ด์ค ๋ชจ๋ธ | [`kakaocorp/kanana-1.5-2.1b-base`](https://huggingface.co/kakaocorp/kanana-1.5-2.1b-base) (Apache-2.0) | | |
| | ์ด๋ํฐ | LoRA (PEFT 0.19.1), r = 32, ฮฑ = 64, dropout = 0 | | |
| | ํ๊ฒ ๋ชจ๋ | ์ธ์ด ๋ ์ด์ด ์ ์ฒด ์ ํ ๊ณ์ธต (attn q/k/v/o + mlp gate/up/down) | | |
| | ์ด๋ํฐ ํฌ๊ธฐ | ์ฝ 184 MB (`safetensors`) โ ํ ํฌ๋์ด์ ํฌํจ ์ด ์ฝ 201 MB | | |
| | ์ํ์ค ๊ธธ์ด | 1,024 | | |
| | ํ์ต ๋ฐ์ดํฐ | AI Hub #126 โ ํ๊ตญ์ด-์์ด ๋ณ๋ ฌ ๋ง๋ญ์น | | |
| | ํ์ต ์ ์ | 988,003 (์๋ฐฉํฅ, ๊ท ํ ์ํ๋ง) | | |
| | ์ง์ ์ธ์ด | ํ๊ตญ์ด โ ์์ด | | |
| | ๋ผ์ด์ ์ค (์ด๋ํฐ) | CC BY-NC 4.0 โ ๋น์๋ฆฌ ์ฉ๋ ํ์ | | |
| | ๋ผ์ด์ ์ค (๋ฒ ์ด์ค ๋ชจ๋ธ) | Apache 2.0 | | |
| > **๋ฒ ์ด์ค ๋ชจ๋ธ ์์ฒด๋ Apache 2.0**์ด๋ผ ์์ฉ ๊ฐ๋ฅํ์ง๋ง, ๋ณธ ์ด๋ํฐ๋ ํ์ต ๋ฐ์ดํฐ(AI Hub #126)์ ๋น์๋ฆฌยท์ฐ๊ตฌ ๋ชฉ์ ์ ์ ์ ๋ง์ถฐ **CC BY-NC 4.0**์ผ๋ก ๋ฐฐํฌํฉ๋๋ค. ์์ฉ ์ฌ์ฉ์ ์ํ๋ค๋ฉด ์์ฒด ๋ฐ์ดํฐ๋ก ์ฌํ์ตํ์ธ์. | |
| --- | |
| ## ์ฑ๋ฅ (๊ท ํ ๊ฒ์ฆ์ 200์, AI Hub #126 12๊ฐ ์นดํ ๊ณ ๋ฆฌ ร ์๋ฐฉํฅ) | |
| | ๋ชจ๋ธ | BLEU | chrF++ | | |
| |---|---:|---:| | |
| | Kanana 1.5 2.1B base (๋ฒ ์ด์ค) | 13.78 | 38.11 | | |
| | **+ LoRA (988K, 1 epoch)** | **32.04** | **57.26** | | |
| | ฮ | **+18.26** | **+19.15** | | |
| ### ์๋งค Gemma ๋ชจ๋ธ๊ณผ์ ๋น๊ต (๋์ผ ๋ฐ์ดํฐยท๋์ผ ํ๊ฐ) | |
| | ๋ชจ๋ธ | ๋ฒ ์ด์ค BLEU | ๋ฒ ์ด์ค chrF++ | Post-SFT BLEU | Post-SFT chrF++ | | |
| |---|---:|---:|---:|---:| | |
| | Gemma 3 1B-IT + LoRA | 1.88 | 23.07 | 25.13 | 49.64 | | |
| | **Kanana 1.5 2.1B base + LoRA** | **13.78** | **38.11** | **32.04** | **57.26** | | |
| | ์ฐจ์ด | +11.90 | +15.04 | **+6.91** | **+7.62** | | |
| - Kanana base์ ์ถ๋ฐ์ (13.78 BLEU)์ด Gemma-IT ๋ฒ ์ด์ค(1.88)๋ณด๋ค ์๋ฑํ ๋์ โ ์นด์นด์ค์ ํ๊ตญ์ด ์ฌ์ ํ์ต ์ฐ์ | |
| - Post-SFT ์ ๋ ์ ์๋ Kanana ์ฐ์ (+6.91 BLEU, +7.62 chrF++) | |
| - SFT ๊ฒ์ธ(ฮ)์ ์ถ๋ฐ์ ์ด ๋ฎ์๋ Gemma๊ฐ ๋ ํผ โ ๋ฒ ์ด์ค ์์ฒด์ ํ๊ตญ์ด ๋ฅ๋ ฅ ์ฐจ์ด๊ฐ ๋ฐ์๋จ | |
| --- | |
| ## ํ์ผ ๋ชฉ๋ก | |
| | ํ์ผ | ํฌ๊ธฐ | ํ์ | | |
| |---|---:|---| | |
| | `adapter_model.safetensors` | 184 MB | PEFT LoRA ๊ฐ์ค์น | | |
| | `adapter_config.json` | 1.4 KB | PEFT ์ค์ (r=32, ฮฑ=64, target_modules) | | |
| | `tokenizer.json` | 17 MB | Kanana ํ ํฌ๋์ด์ (LFS) | | |
| | `tokenizer_config.json` / `special_tokens_map.json` / `chat_template.jinja` | ~70 KB | ํ ํฌ๋์ด์ ๋ฉํ๋ฐ์ดํฐ | | |
| | `eval_result.json` | < 1 KB | 200์ BLEU/chrF ๊ฒฐ๊ณผ | | |
| --- | |
| ## ์ฌ์ฉ๋ฒ | |
| ### ์ ๋ ฅ ํฌ๋งท (์ค์) | |
| ๋ณธ ์ด๋ํฐ๋ **base ๋ชจ๋ธ** ์์ ํ์ต๋์์ผ๋ฏ๋ก chat template์ ์ฌ์ฉํ์ง ์๊ณ raw Instruction/Response ํฌ๋งท์ ๋ฐ๋ฆ ๋๋ค: | |
| ``` | |
| ### Instruction: | |
| {์ง์๋ฌธ} | |
| ### Input: | |
| {์ ๋ ฅ ๋ฌธ์ฅ โ ์ต์ } | |
| ### Response: | |
| {์ถ๋ ฅ} | |
| ``` | |
| ### PyTorch + PEFT | |
| ```python | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| from peft import PeftModel | |
| import torch | |
| base = AutoModelForCausalLM.from_pretrained( | |
| "kakaocorp/kanana-1.5-2.1b-base", | |
| torch_dtype=torch.bfloat16, | |
| device_map="auto", | |
| ) | |
| model = PeftModel.from_pretrained(base, "harveykim/kanana-1.5-2.1b-aihub-ko-en-lora") | |
| tok = AutoTokenizer.from_pretrained("harveykim/kanana-1.5-2.1b-aihub-ko-en-lora") | |
| prompt = ( | |
| "### Instruction:\n๋ค์ ํ๊ตญ์ด ๋ฌธ์ฅ์ ์์ด๋ก ๋ฒ์ญํ์ธ์.\n\n" | |
| "### Input:\n์ค๋ ์ ์ฌ์ ๊น์น์ฐ๊ฐ ์ด๋์?\n\n" | |
| "### Response:\n" | |
| ) | |
| inputs = tok(prompt, return_tensors="pt").to(model.device) | |
| out = model.generate(**inputs, max_new_tokens=128, do_sample=False) | |
| print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)) | |
| # โ "How about kimchi stew for lunch today?" | |
| ``` | |
| ### Unsloth (ํ์ตยทํ๊ฐ์ ๊ถ์ฅ, ๋ ๋น ๋ฆ) | |
| ```python | |
| from unsloth import FastModel | |
| model, tok = FastModel.from_pretrained( | |
| model_name="harveykim/kanana-1.5-2.1b-aihub-ko-en-lora", | |
| max_seq_length=1024, | |
| load_in_4bit=True, | |
| ) | |
| ``` | |
| --- | |
| ## ํ์ต ์ ๋ณด | |
| | ํญ๋ชฉ | ๊ฐ | | |
| |---|---| | |
| | ํ๋ ์์ํฌ | Unsloth + PEFT 0.19.1 + TRL 0.19.1 | | |
| | ์ ๋ฐ๋ | bfloat16 (Unsloth 4-bit ๋ฒ ์ด์ค) | | |
| | ๋ฐฐ์น | device๋น 16 ร grad-accum 2 (effective 32) | | |
| | ์ํ์ค ๊ธธ์ด | 1,024 | | |
| | ์ตํฐ๋ง์ด์ | AdamW 8-bit | | |
| | ํ์ต๋ฅ / ์ค์ผ์ค | 7e-5, cosine, 2% warmup | | |
| | Weight decay | 0.01 | | |
| | ์คํ ์ | 30,876 (988K / 32, 1 epoch) | | |
| | ์ด ํ์ต ์๊ฐ | 5.88 ์๊ฐ (RTX 3090) | | |
| | Train loss (์ต์ข ) | 1.377 | | |
| | Eval loss | 1.413 โ 1.299 (15ํ ํ๊ฐ, ๋จ์กฐ ๊ฐ์, ๊ณผ์ ํฉ ์์) | | |
| | Seed | 42 | | |
| ์ ์ฒด ํ์ต ์คํฌ๋ฆฝํธ: [`training/train_kanana.py`](https://github.com/DeepMav/on-device-translation/blob/main/training/train_kanana.py) | |
| 4๋จ๊ณ ์๋ํ ํ์ดํ๋ผ์ธ: [`training/run_pipeline_kanana.sh`](https://github.com/DeepMav/on-device-translation/blob/main/training/run_pipeline_kanana.sh) | |
| Gemma์์ ์์ธ ๋น๊ต ๋ฆฌํฌํธ: [`docs/COMPARISON_REPORT.md`](https://github.com/DeepMav/on-device-translation/blob/main/docs/COMPARISON_REPORT.md) | |
| > Gemma ๋ชจ๋ธ ๋๋น LR์ 7e-5๋ก ๋ณด์์ ์ผ๋ก ์ค์ ํ ์ด์ : Kanana 2.1B base๋ **instruct-tuned๊ฐ ์๋ raw base** ๋ชจ๋ธ์ด๋ผ, ๊ฐํ LR์ด ์ฌ์ ํ์ต๋ ํ๊ตญ์ด ํํ ๋ฅ๋ ฅ์ ์์์ํฌ ์ํ์ด ๋ ํฝ๋๋ค. | |
| --- | |
| ## ํ๊ณ์ | |
| - **๋๋ฉ์ธ ๋ถํฌ ํธํฅ** โ AI Hub #126 ๋ถํฌ (๋ํ์ฒดยท๊ตฌ์ด์ฒด ์ค์ฌ)์ ๋ง์ถฐ์ ธ ์์ด ์ยท๋ฌธํยท์ ๋ฌธ ๋ถ์ผ๋ ๋ณด์ฅ ์์ | |
| - **1 epoch ๋จ๋ฐฉํฅ ํ์ต** โ Eval loss๊ฐ ๋จ์กฐ ๊ฐ์ ์ค์ด๋ฉฐ plateau ๋ฏธ๋๋ฌ. ์ถ๊ฐ ํ์ต ์ฌ์ง ์์ | |
| - **Base ๋ชจ๋ธ ์ถ๋ ฅ ํ์ ์์กด** โ Instruction/Response ํฌ๋งท์ ์ ํํ ๋ฐ๋ฅด์ง ์์ผ๋ฉด ํ์ง ์ ํ | |
| --- | |
| ## ๋ผ์ด์ ์ค | |
| - **์ด๋ํฐ ๊ฐ์ค์น (์ด ๋ ํฌ)** โ **CC BY-NC 4.0** (๋น์๋ฆฌยท์ฐ๊ตฌ ๋ชฉ์ ). ์ถ์ฒ ํ๊ธฐ ์ ์์ ๋กญ๊ฒ ์ฌ์ฉ ๊ฐ๋ฅ | |
| - **๋ฒ ์ด์ค ๋ชจ๋ธ** โ Apache 2.0 ([`kakaocorp/kanana-1.5-2.1b-base`](https://huggingface.co/kakaocorp/kanana-1.5-2.1b-base)) | |
| - **๋ฐ์ดํฐ์ ** โ AI Hub #126์ NIA(ํ๊ตญ์ง๋ฅ์ ๋ณด์ฌํ์งํฅ์)๊ฐ ์ฐ๊ตฌ ๋ชฉ์ ์ผ๋ก ์ ๊ณตํ๋ฉฐ ์ฌ๋ฐฐํฌ ๊ธ์ง. ์๋ณธ ๋ฐ์ดํฐ๋ ๋ณธ ๋ ํฌ์ ํฌํจ๋์ง ์์ | |
| ### ์์ฉ ์ฌ์ฉ์ ์ํ๋ค๋ฉด | |
| ๋ฒ ์ด์ค ๋ชจ๋ธ ์์ฒด๋ Apache 2.0์ด๋ฏ๋ก, ๋ณธ์ธ ๋ฐ์ดํฐ๋ก ๋ณ๋ LoRA๋ฅผ ์ฌํ์ตํ๋ฉด ์์ฉ ๊ฐ๋ฅํฉ๋๋ค. AI Hub #126 ํ์ต ๊ฒฐ๊ณผ๋ฌผ์ธ ์ด ์ด๋ํฐ๋ ๋น์๋ฆฌ ์ฉ๋๋ก๋ง ์ฌ์ฉ ๊ฐ๋ฅํฉ๋๋ค. | |
| --- | |
| ## ์ธ์ฉ | |
| ```bibtex | |
| @misc{kanana_aihub_ko_en_lora_2026, | |
| title = {kanana-1.5-2.1b-aihub-ko-en-lora: Korean-English Translation LoRA for Kanana 1.5}, | |
| author = {Kim, Harvey}, | |
| year = {2026}, | |
| url = {https://huggingface.co/harveykim/kanana-1.5-2.1b-aihub-ko-en-lora} | |
| } | |
| ``` | |
| ## ๊ฐ์ฌ์ ๊ธ | |
| - ๋ฒ ์ด์ค ๋ชจ๋ธ: [Kanana 1.5 2.1B base](https://huggingface.co/kakaocorp/kanana-1.5-2.1b-base) โ ์นด์นด์ค (Apache 2.0) | |
| - ํ์ต ํ๋ ์์ํฌ: [Unsloth](https://github.com/unslothai/unsloth) | |
| - ๋ฐ์ดํฐ์ : [AI Hub #126 ํ๊ตญ์ด-์์ด ๋ฒ์ญ(๋ณ๋ ฌ) ๋ง๋ญ์น](https://www.aihub.or.kr/aihubdata/data/view.do?dataSetSn=126) โ ์ฐ๊ตฌ ๋ชฉ์ ํ์ , ์๋ณธ ๋ฐ์ดํฐ๋ ๋ณธ ๋ ํฌ์ ๋ฏธํฌํจ | |
| ๋ณธ ๋ชจ๋ธ์ ๊ณผํ๊ธฐ์ ์ ๋ณดํต์ ๋ถ์ ์ฌ์์ผ๋ก ํ๊ตญ์ง๋ฅ์ ๋ณด์ฌํ์งํฅ์์ ์ง์์ ๋ฐ์ ๊ตฌ์ถ๋ "ํ๊ตญ์ด-์์ด ๋ฒ์ญ(๋ณ๋ ฌ) ๋ง๋ญ์น"(AI Hub, ๋ฐ์ดํฐ์ ๋ฒํธ 126)๋ฅผ ํ์ฉํ์ฌ ํ์ต๋์์ต๋๋ค. | |