Llama4 Eagle Drafter Model (Test)
This is a test Eagle drafter model for Llama4. It includes an Eagle draft configuration and draft/target vocabulary mappings (d2t and t2d).
Model Details
- Architecture: Llama4ForCausalLM (Eagle draft variant)
- Hidden size: 2048
- Layers: 1 (single decoder layer for Eagle draft)
- Vocabulary: 128256 tokens (Llama4)
- Includes: d2t and t2d vocabulary mappings
Configuration
- Uses standard Llama4 architecture
- Includes Eagle auxiliary state configuration
- Has vocabulary mapping tensors (d2t/t2d) for converting between draft and target token IDs
- Extended context support (262k max position embeddings)
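The layout of the d2t and t2d tensors is not described here. The following is a minimal sketch, assuming the offset-based convention used by some Eagle-style checkpoints: d2t stores a per-draft-token offset so that target_id = draft_id + d2t[draft_id], and t2d is a boolean mask over the target vocabulary. The helper names and toy vocabulary sizes are hypothetical and are not read from this model.

```python
def build_mappings(target_vocab_size, kept_target_ids):
    """Build toy d2t offsets and a t2d mask from the target IDs kept in the draft vocab."""
    kept = sorted(kept_target_ids)
    # Assumed convention: d2t[i] is the offset that maps draft ID i to its target ID.
    d2t = [target_id - draft_id for draft_id, target_id in enumerate(kept)]
    # Assumed convention: t2d[j] is True when target token j exists in the draft vocab.
    t2d = [False] * target_vocab_size
    for target_id in kept:
        t2d[target_id] = True
    return d2t, t2d


def draft_to_target(draft_id, d2t):
    """Convert a draft token ID to a target token ID using the offset table."""
    return draft_id + d2t[draft_id]


# Toy example: a target vocabulary of 10 tokens, 4 of which the drafter can emit.
d2t, t2d = build_mappings(target_vocab_size=10, kept_target_ids=[0, 3, 4, 9])
mapped = [draft_to_target(i, d2t) for i in range(len(d2t))]
```

If the draft and target vocabularies are the same size, a mapping of this kind reduces to the identity (all offsets zero, all mask entries true).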
Usage
This model is for testing Eagle speculative decoding with Llama4 in vLLM:
vllm serve <llama4-target-model> \
    --speculative-config '{"method": "eagle", "model": "nm-testing/llama4-eagle-drafter", ...}'
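Hand-quoting the JSON argument in a shell is error-prone, so one option is to build the command in Python. This is only a sketch: the `num_speculative_tokens` entry and its value are an assumption added for illustration and do not stand in for whatever the command above elides, and `<llama4-target-model>` remains a placeholder.

```python
import json
import shlex

spec_config = {
    "method": "eagle",
    "model": "nm-testing/llama4-eagle-drafter",
    # Assumed illustrative field; the original command does not show its other fields.
    "num_speculative_tokens": 3,
}

# Placeholder kept from the usage line above; substitute a real target model.
target_model = "<llama4-target-model>"

command = [
    "vllm",
    "serve",
    target_model,
    "--speculative-config",
    json.dumps(spec_config),
]

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

The resulting list can also be passed directly to `subprocess.run(command)`, which avoids shell quoting altogether.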
Testing Purpose
This model contains random weights, so its draft predictions are not meaningful. It is intended only for testing the Eagle implementation in vLLM.