OLMoE_1B_7B_Eagle3 / README.md
wantsleep's picture
.
d0c3269
|
Raw
History Blame
3.12 kB

OLMoE-1B-7B-Eagle3 Draft Model

์ด ์ €์žฅ์†Œ๋Š” OLMoE-1B-7B-Eagle3 ๊ธฐ๋ฐ˜์˜ EAGLE Draft ๋ชจ๋ธ ๊ฐ€์ค‘์น˜์™€ ๊ด€๋ จ ์ฝ”๋“œ, ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.


๐Ÿ“ฆ ํฌํ•จ ํŒŒ์ผ

  • pytorch_model.bin : ํ•™์Šต๋œ EAGLE Draft ๋ชจ๋ธ ๊ฐ€์ค‘์น˜
  • config.json : ๋ชจ๋ธ ์„ค์ • ํŒŒ์ผ (OLMoE ๊ตฌ์กฐ)
  • tokenizer_config.json : ํ† ํฌ๋‚˜์ด์ € ์„ค์ • ํŒŒ์ผ
  • modeling_olmoe_kv.py : OLMoE ์ „์šฉ ๋ชจ๋ธ ์ฝ”๋“œ (EAGLE ์ถ”๋ก  ์‹œ ํ•„์š”)
  • eagle_data.json : ํ•™์Šต์— ์‚ฌ์šฉ๋œ ๋ฐ์ดํ„ฐ์…‹ (ShareGPT ์งˆ๋ฌธ + OLMoE ๋‹ต๋ณ€)
  • .gitattributes : Git LFS ์„ค์ • ๋“ฑ

๐Ÿฆ… EAGLE Draft ๋ชจ๋ธ์ด๋ž€?

EAGLE์€ ๋Œ€๊ทœ๋ชจ ์–ธ์–ด๋ชจ๋ธ(LLM)์˜ ์ถ”๋ก  ์†๋„๋ฅผ ํš๊ธฐ์ ์œผ๋กœ ๋†’์ด๊ธฐ ์œ„ํ•ด,
Draft(์ดˆ์•ˆ) ๋””์ฝ”๋” ๊ณ„์ธต์„ ๋ณ„๋„๋กœ ํ•™์Šต์‹œํ‚ค๋Š” ๊ตฌ์กฐ์ž…๋‹ˆ๋‹ค.

  • OLMoE-1B-7B-0125-Instruct์˜ ๊ตฌ์กฐ์™€ ํ˜ธํ™˜
  • EAGLE Draft ๊ณ„์ธต์€ Main Model์˜ ๋””์ฝ”๋”์™€ ๊ตฌ์กฐ์ ์œผ๋กœ ์œ ์‚ฌํ•˜๊ฒŒ ์„ค๊ณ„๋จ
  • ์ถ”๋ก  ์‹œ, Draft ๊ณ„์ธต์ด ์—ฌ๋Ÿฌ ํ† ํฐ์„ ๋ฏธ๋ฆฌ ์ƒ์„ฑ โ†’ Main Model์ด ๊ฒ€์ฆ/accept

๐Ÿ“ ํ•™์Šต ๋ฐ์ดํ„ฐ ์„ค๋ช…

  • eagle_data.json
    • ShareGPT ๋ฐ์ดํ„ฐ์…‹์—์„œ ์งˆ๋ฌธ(ํ”„๋กฌํ”„ํŠธ)๋งŒ ์ถ”์ถœ
    • ๊ฐ ์งˆ๋ฌธ์— ๋Œ€ํ•ด allenai/OLMoE-1B-7B-0125-Instruct ๋ชจ๋ธ์ด ์ง์ ‘ ๋‹ต๋ณ€์„ ์ƒ์„ฑ
    • ์ฆ‰, ๋ชจ๋ธ์ด ์Šค์Šค๋กœ ์ƒ์„ฑํ•œ ๋‹ต๋ณ€์„ ์ •๋‹ต์œผ๋กœ ์‚ฌ์šฉํ•˜์—ฌ Draft ๊ณ„์ธต์„ ํ•™์Šต
    • ์ด๋ ‡๊ฒŒ ํ•˜๋ฉด, Draft ๊ณ„์ธต์ด Main Model์˜ ๋””์ฝ”๋”์™€ ๋” ๊ฐ€๊นŒ์šด ๋ถ„ํฌ๋ฅผ ํ•™์Šตํ•˜๊ฒŒ ๋˜์–ด
      EAGLE ์ถ”๋ก  ์„ฑ๋Šฅ์ด ๊ทน๋Œ€ํ™”๋จ

๐Ÿ› ๏ธ ์‚ฌ์šฉ๋ฒ•

1. ๋ชจ๋ธ ๊ฐ€์ค‘์น˜/์„ค์ • ํŒŒ์ผ ์‚ฌ์šฉ

  • pytorch_model.bin, config.json, tokenizer_config.json์„
    HuggingFace Transformers ๋˜๋Š” EAGLE ์ฝ”๋“œ์—์„œ ๋ฐ”๋กœ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

2. EAGLE Inference ์ฝ”๋“œ์— ์ ์šฉ

  • modeling_olmoe_kv.py ํŒŒ์ผ์„
    EAGLE ๊ณต์‹ ์ €์žฅ์†Œ์˜ EAGLE/eagle/model/ ๋””๋ ‰ํ† ๋ฆฌ์— ๋ณต์‚ฌ/๋ฎ์–ด์“ฐ๊ธฐ ํ•˜์„ธ์š”.
  • EAGLE ์ถ”๋ก  ์Šคํฌ๋ฆฝํŠธ์—์„œ
    from eagle.model.modeling_olmoe_kv import OlmoeForCausalLM
    ํ˜•ํƒœ๋กœ import ํ•˜์—ฌ ์‚ฌ์šฉํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.

3. ์˜ˆ์‹œ ์ฝ”๋“œ

from transformers import AutoTokenizer
from eagle.model.ea_model import EaModel

tokenizer = AutoTokenizer.from_pretrained('allenai/OLMoE-1B-7B-0125-Instruct')
model = EaModel.from_pretrained(
    base_model_path='allenai/OLMoE-1B-7B-0125-Instruct',
    ea_model_path='์ด ์ €์žฅ์†Œ ๊ฒฝ๋กœ',
    torch_dtype='bfloat16'
)

โš ๏ธ ์ฐธ๊ณ /์œ ์˜์‚ฌํ•ญ

  • eagle_data.json์€ ๊ณต๊ฐœ๋œ ShareGPT ์งˆ๋ฌธ์— ๋Œ€ํ•ด OLMoE๊ฐ€ ์ƒ์„ฑํ•œ ๋‹ต๋ณ€๋งŒ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.
  • EAGLE Draft ๊ณ„์ธต์€ Main Model์˜ ๊ตฌ์กฐ์™€ ์ตœ๋Œ€ํ•œ ์œ ์‚ฌํ•˜๊ฒŒ ์„ค๊ณ„๋˜์–ด์•ผ
    ์ถ”๋ก  ํšจ์œจ์ด ๊ทน๋Œ€ํ™”๋ฉ๋‹ˆ๋‹ค.
  • modeling_olmoe_kv.py๋Š” ๋ฐ˜๋“œ์‹œ EAGLE ์ถ”๋ก  ์ฝ”๋“œ์— ํฌํ•จ๋˜์–ด์•ผ ์ •์ƒ ๋™์ž‘ํ•ฉ๋‹ˆ๋‹ค.

๐Ÿ“š ์ธ์šฉ/์ฐธ๊ณ 


๋ฌธ์˜/ํ”ผ๋“œ๋ฐฑ์€ ์ด์Šˆ๋กœ ๋‚จ๊ฒจ์ฃผ์„ธ์š”!