nm-testing/llama4-scout-17b-eagle3-dummy-drafter

speculative-decoding

RelaxingSnorlax

Upload Llama4 Eagle3 dummy drafter for vLLM testing

217d835 verified 10 months ago

Llama4 Eagle Drafter Model (Test)

This is a test Eagle drafter model for Llama4. It ships with a complete model configuration and the d2t/t2d vocabulary mapping tensors.

Model Details

Architecture: Llama4ForCausalLM (Eagle draft variant)
Hidden size: 2048
Layers: 1 (single decoder layer for Eagle draft)
Vocabulary: 128256 tokens (Llama4)
Includes: d2t and t2d vocabulary mappings
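
As a rough illustration, the details listed above could be checked against a loaded config dictionary. This is a minimal sketch: the key names (hidden_size, num_hidden_layers, vocab_size) follow the usual Hugging Face config conventions and are an assumption here, since this card does not show the actual config.json. The helper check_config is hypothetical.

```python
# Values taken from the "Model Details" list in this card.
# Key names are assumed (common Hugging Face config keys), not confirmed by the card.
EXPECTED = {
    "hidden_size": 2048,
    "num_hidden_layers": 1,
    "vocab_size": 128256,
}


def check_config(config: dict) -> list:
    """Return a list of human-readable mismatches; empty means all checks passed."""
    problems = []
    for key, want in EXPECTED.items():
        got = config.get(key)
        if got != want:
            problems.append(f"{key}: expected {want}, got {got!r}")
    return problems


# A hand-written sample standing in for the real config.json contents.
sample = {"hidden_size": 2048, "num_hidden_layers": 1, "vocab_size": 128256}
problems = check_config(sample)
```

In practice the dictionary would come from json.load on the downloaded config.json rather than from a hand-written sample.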

Configuration

Uses standard Llama4 architecture
Includes Eagle auxiliary state configuration
Has vocabulary mapping tensors (d2t/t2d) for converting between draft and target token ids
Extended context support (262k max position embeddings)
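
To make the d2t/t2d tensors more concrete, here is a toy sketch of one common Eagle3-style convention. It is an assumption, not something this card states: d2t is taken to hold a per-draft-id offset (target id = draft id + d2t[draft id]) and t2d a boolean mask over the target vocabulary marking which target ids the draft vocabulary covers. Plain lists stand in for tensors, and the helper names are hypothetical.

```python
from typing import List, Tuple


def build_mappings(kept_target_ids: List[int],
                   target_vocab_size: int) -> Tuple[List[int], List[bool]]:
    """Build toy d2t offsets and a t2d mask from the target ids the drafter keeps."""
    kept = sorted(set(kept_target_ids))
    # d2t[d] is the offset to add to draft id d to get the target id (assumed convention).
    d2t = [t - d for d, t in enumerate(kept)]
    kept_set = set(kept)
    # t2d[t] is True when target id t exists in the draft vocabulary (assumed convention).
    t2d = [t in kept_set for t in range(target_vocab_size)]
    return d2t, t2d


def draft_to_target(draft_id: int, d2t: List[int]) -> int:
    """Map a draft token id to its target token id using the offset table."""
    return draft_id + d2t[draft_id]


# Toy example: a target vocabulary of 8 ids, of which the drafter keeps 4.
d2t, t2d = build_mappings([0, 2, 5, 7], target_vocab_size=8)
mapped = [draft_to_target(d, d2t) for d in range(len(d2t))]
```

When the draft and target vocabularies are the same size, this convention reduces to all-zero offsets and an all-True mask.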

Usage

This model is for testing Eagle speculative decoding with Llama4 in vLLM:

vllm serve <llama4-target-model> \
    --speculative-config '{"method": "eagle", "model": "nm-testing/llama4-eagle-drafter", ...}'
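
When launching the server from a script, the JSON passed to --speculative-config can be built programmatically to avoid shell-quoting mistakes. The sketch below is only an illustration: build_serve_command is a hypothetical helper, the target model placeholder is kept from the command above, and the keys elided with "..." in the original are deliberately left out rather than guessed.

```python
import json
import shlex
from typing import List


def build_serve_command(target_model: str, drafter_model: str,
                        method: str = "eagle") -> List[str]:
    """Assemble the argv list for `vllm serve` with a speculative config."""
    # Only the two keys shown in this card; any further keys are not filled in here.
    spec = {"method": method, "model": drafter_model}
    return ["vllm", "serve", target_model,
            "--speculative-config", json.dumps(spec)]


cmd = build_serve_command("<llama4-target-model>",
                          "nm-testing/llama4-eagle-drafter")
# shlex.join quotes the JSON argument so the line can be pasted into a POSIX shell.
printable = shlex.join(cmd)
```

The argv list can be handed to subprocess.run directly, which sidesteps shell quoting altogether; printable is only useful for logging or copy-pasting.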

Testing Purpose

This model contains random weights. It is intended only for testing the vLLM Eagle implementation and is not suitable for real inference.