Tengyunw
/

qwen3_30b_moe_eagle3

Model card Files Files and versions

qwen3_30b_moe_eagle3 / README.md

Lil2J's picture

Update README.md

8bc2b57 verified 10 months ago

|

1.84 kB

	---
	license: mit
	base_model:
	- Qwen/Qwen3-8B
	---

	## Introduce
	We adapted the official speculative sampling training method, Eagle3, for training on Qwen3-30B-A3B

	After implementing Eagle3, the inference performance of Qwen3-30B-Moe using the SGLang framework on 8*H200 GPU improved from 183 tokens/s to 325 tokens/s.

	The TPS (tokens per second) improvement reached nearly 70%.

	On a single RTX 5090, the TPS (transactions per second) of Qwen3-8B-Eagle3 increased from 164 to 268.


	\| model \| gpu \| tps \|
	\|---------\|---------\|---------\|
	\| qwen3-30b_moe \| h200 \| 147 \|
	\| qwen3-30b-moe_eagle3 \| h200 \| 231 \|
	\| qwen3-30b_moe \| 8*h200 \| 183 \|
	\| qwen3-30b_moe-eagle3 \| 8*h200 \| 325 \|
	\| qwen3-30b_moe \| 8*5090 \| 164 \|
	\| qwen3-30b_moe-eagle3 \| 8*5090 \| 268 \|

	Join our AI computing power cloud platform now and enjoy the best AI cloud service experience. The link is as follows: https://tenyunn.com/
	## How to use

	To use Eagle3 with SGLang, first replace the qwen3_moe.py file in SGLang’s directory (sglang/python/sglang/srt/models/) with the qwen3_moe.py file from this project.


	The launch command for using Eagle3 with SGLang is:

	```python3

	python3 -m sglang.launch_server --model Qwen/Qwen3-30B-A3B --speculative-algorithm EAGLE3 --speculative-draft-model-path Tengyunw/qwen3_30b_moe_eagle3 --speculative-num-steps 6 --speculative-eagle-topk 10 --speculative-num-draft-tokens 32 --mem-fraction 0.9 --cuda-graph-max-bs 2 --dtype bfloat16

	```

	## How to train

	Training Dataset:
	ultrachat_200k.
	Only the prompts from these datasets were utilized for data synthesis. This synthesized data is used to train the Eagle modules.

	dataset nums: 600K samples,1B tokens

	Evaluation Dataset:
	ShareGPT,GSM8K,HUAMEVAL,MT-BENCH,APLCA

	Our Sharegpt test data is located in the eagle_data.jsonl file under this directory.