OLMoE_1B_7B_Eagle3 / README.md
wantsleep's picture
.
d0c3269
|
Raw
History Blame
3.12 kB
# OLMoE-1B-7B-Eagle3 Draft Model
์ด ์ €์žฅ์†Œ๋Š” OLMoE-1B-7B-Eagle3 ๊ธฐ๋ฐ˜์˜ EAGLE Draft ๋ชจ๋ธ ๊ฐ€์ค‘์น˜์™€ ๊ด€๋ จ ์ฝ”๋“œ, ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.
---
## ๐Ÿ“ฆ ํฌํ•จ ํŒŒ์ผ
- `pytorch_model.bin` : ํ•™์Šต๋œ EAGLE Draft ๋ชจ๋ธ ๊ฐ€์ค‘์น˜
- `config.json` : ๋ชจ๋ธ ์„ค์ • ํŒŒ์ผ (OLMoE ๊ตฌ์กฐ)
- `tokenizer_config.json` : ํ† ํฌ๋‚˜์ด์ € ์„ค์ • ํŒŒ์ผ
- `modeling_olmoe_kv.py` : OLMoE ์ „์šฉ ๋ชจ๋ธ ์ฝ”๋“œ (EAGLE ์ถ”๋ก  ์‹œ ํ•„์š”)
- `eagle_data.json` : ํ•™์Šต์— ์‚ฌ์šฉ๋œ ๋ฐ์ดํ„ฐ์…‹ (ShareGPT ์งˆ๋ฌธ + OLMoE ๋‹ต๋ณ€)
- `.gitattributes` : Git LFS ์„ค์ • ๋“ฑ
---
## ๐Ÿฆ… EAGLE Draft ๋ชจ๋ธ์ด๋ž€?
EAGLE์€ ๋Œ€๊ทœ๋ชจ ์–ธ์–ด๋ชจ๋ธ(LLM)์˜ ์ถ”๋ก  ์†๋„๋ฅผ ํš๊ธฐ์ ์œผ๋กœ ๋†’์ด๊ธฐ ์œ„ํ•ด,
**Draft(์ดˆ์•ˆ) ๋””์ฝ”๋” ๊ณ„์ธต**์„ ๋ณ„๋„๋กœ ํ•™์Šต์‹œํ‚ค๋Š” ๊ตฌ์กฐ์ž…๋‹ˆ๋‹ค.
- **OLMoE-1B-7B-0125-Instruct**์˜ ๊ตฌ์กฐ์™€ ํ˜ธํ™˜
- EAGLE Draft ๊ณ„์ธต์€ Main Model์˜ ๋””์ฝ”๋”์™€ ๊ตฌ์กฐ์ ์œผ๋กœ ์œ ์‚ฌํ•˜๊ฒŒ ์„ค๊ณ„๋จ
- ์ถ”๋ก  ์‹œ, Draft ๊ณ„์ธต์ด ์—ฌ๋Ÿฌ ํ† ํฐ์„ ๋ฏธ๋ฆฌ ์ƒ์„ฑ โ†’ Main Model์ด ๊ฒ€์ฆ/accept
---
## ๐Ÿ“ ํ•™์Šต ๋ฐ์ดํ„ฐ ์„ค๋ช…
- **eagle_data.json**
- ShareGPT ๋ฐ์ดํ„ฐ์…‹์—์„œ **์งˆ๋ฌธ(ํ”„๋กฌํ”„ํŠธ)๋งŒ ์ถ”์ถœ**
- ๊ฐ ์งˆ๋ฌธ์— ๋Œ€ํ•ด **allenai/OLMoE-1B-7B-0125-Instruct** ๋ชจ๋ธ์ด ์ง์ ‘ ๋‹ต๋ณ€์„ ์ƒ์„ฑ
- ์ฆ‰, **๋ชจ๋ธ์ด ์Šค์Šค๋กœ ์ƒ์„ฑํ•œ ๋‹ต๋ณ€**์„ ์ •๋‹ต์œผ๋กœ ์‚ฌ์šฉํ•˜์—ฌ Draft ๊ณ„์ธต์„ ํ•™์Šต
- ์ด๋ ‡๊ฒŒ ํ•˜๋ฉด, Draft ๊ณ„์ธต์ด Main Model์˜ ๋””์ฝ”๋”์™€ ๋” ๊ฐ€๊นŒ์šด ๋ถ„ํฌ๋ฅผ ํ•™์Šตํ•˜๊ฒŒ ๋˜์–ด
EAGLE ์ถ”๋ก  ์„ฑ๋Šฅ์ด ๊ทน๋Œ€ํ™”๋จ
---
## ๐Ÿ› ๏ธ ์‚ฌ์šฉ๋ฒ•
### 1. ๋ชจ๋ธ ๊ฐ€์ค‘์น˜/์„ค์ • ํŒŒ์ผ ์‚ฌ์šฉ
- `pytorch_model.bin`, `config.json`, `tokenizer_config.json`์„
HuggingFace Transformers ๋˜๋Š” EAGLE ์ฝ”๋“œ์—์„œ ๋ฐ”๋กœ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
### 2. EAGLE Inference ์ฝ”๋“œ์— ์ ์šฉ
- `modeling_olmoe_kv.py` ํŒŒ์ผ์„
EAGLE ๊ณต์‹ ์ €์žฅ์†Œ์˜ `EAGLE/eagle/model/` ๋””๋ ‰ํ† ๋ฆฌ์— ๋ณต์‚ฌ/๋ฎ์–ด์“ฐ๊ธฐ ํ•˜์„ธ์š”.
- EAGLE ์ถ”๋ก  ์Šคํฌ๋ฆฝํŠธ์—์„œ
`from eagle.model.modeling_olmoe_kv import OlmoeForCausalLM`
ํ˜•ํƒœ๋กœ import ํ•˜์—ฌ ์‚ฌ์šฉํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.
### 3. ์˜ˆ์‹œ ์ฝ”๋“œ
```python
from transformers import AutoTokenizer
from eagle.model.ea_model import EaModel
tokenizer = AutoTokenizer.from_pretrained('allenai/OLMoE-1B-7B-0125-Instruct')
model = EaModel.from_pretrained(
base_model_path='allenai/OLMoE-1B-7B-0125-Instruct',
ea_model_path='์ด ์ €์žฅ์†Œ ๊ฒฝ๋กœ',
torch_dtype='bfloat16'
)
```
---
## โš ๏ธ ์ฐธ๊ณ /์œ ์˜์‚ฌํ•ญ
- **eagle_data.json**์€ ๊ณต๊ฐœ๋œ ShareGPT ์งˆ๋ฌธ์— ๋Œ€ํ•ด OLMoE๊ฐ€ ์ƒ์„ฑํ•œ ๋‹ต๋ณ€๋งŒ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.
- EAGLE Draft ๊ณ„์ธต์€ Main Model์˜ ๊ตฌ์กฐ์™€ ์ตœ๋Œ€ํ•œ ์œ ์‚ฌํ•˜๊ฒŒ ์„ค๊ณ„๋˜์–ด์•ผ
์ถ”๋ก  ํšจ์œจ์ด ๊ทน๋Œ€ํ™”๋ฉ๋‹ˆ๋‹ค.
- `modeling_olmoe_kv.py`๋Š” ๋ฐ˜๋“œ์‹œ EAGLE ์ถ”๋ก  ์ฝ”๋“œ์— ํฌํ•จ๋˜์–ด์•ผ ์ •์ƒ ๋™์ž‘ํ•ฉ๋‹ˆ๋‹ค.
---
## ๐Ÿ“š ์ธ์šฉ/์ฐธ๊ณ 
- [EAGLE: Fast Decoding for Large Language Models](https://github.com/SafeAILab/EAGLE)
- [allenai/OLMoE-1B-7B-0125-Instruct](https://huggingface.co/allenai/OLMoE-1B-7B-0125-Instruct)
- [ShareGPT Dataset](https://huggingface.co/datasets/sharegpt)
---
๋ฌธ์˜/ํ”ผ๋“œ๋ฐฑ์€ ์ด์Šˆ๋กœ ๋‚จ๊ฒจ์ฃผ์„ธ์š”!