Instructions for using DAMO-NLP-SG/Mistral-7B-LongPO-128K with libraries and local apps.

Libraries

How to use DAMO-NLP-SG/Mistral-7B-LongPO-128K with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DAMO-NLP-SG/Mistral-7B-LongPO-128K")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DAMO-NLP-SG/Mistral-7B-LongPO-128K")
model = AutoModelForCausalLM.from_pretrained("DAMO-NLP-SG/Mistral-7B-LongPO-128K")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use DAMO-NLP-SG/Mistral-7B-LongPO-128K with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DAMO-NLP-SG/Mistral-7B-LongPO-128K"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DAMO-NLP-SG/Mistral-7B-LongPO-128K",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
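
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `http://localhost:8000` and returns the usual OpenAI-style chat completion response. The helper names `build_chat_request` and `chat` are hypothetical and are not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "DAMO-NLP-SG/Mistral-7B-LongPO-128K"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (without sending) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body, where the reply is at
    choices[0].message.content.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should target a server listening on port 30000 instead, such as the SGLang server described later in this document.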

Use Docker

docker model run hf.co/DAMO-NLP-SG/Mistral-7B-LongPO-128K

SGLang

How to use DAMO-NLP-SG/Mistral-7B-LongPO-128K with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DAMO-NLP-SG/Mistral-7B-LongPO-128K" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DAMO-NLP-SG/Mistral-7B-LongPO-128K",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DAMO-NLP-SG/Mistral-7B-LongPO-128K" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DAMO-NLP-SG/Mistral-7B-LongPO-128K",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use DAMO-NLP-SG/Mistral-7B-LongPO-128K with Docker Model Runner:
```
docker model run hf.co/DAMO-NLP-SG/Mistral-7B-LongPO-128K
```

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

This repo provides the checkpoint of Mistral-7B-LongPO-128K from our paper "LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization".

Highlights of LongPO

Self-evolving long-context alignment without annotations from humans or superior LLMs.
Extends context length and preserves alignment in a single stage.
No degradation of short-context capabilities.

Models and Training Data

| Models | Base Model | Training Data | # Data Samples |
| --- | --- | --- | --- |
| Mistral-7B-LongPO-128K | Mistral-7B-Instruct-v0.2 | HF Link | 45K |
| Qwen2.5-7B-LongPO-128K | Qwen2.5-7B-Instruct | HF Link | 32K |
| Mistral-7B-LongPO-256K-EXP* | Mistral-7B-LongPO-128K | HF Link | 16K |
| Mistral-7B-LongPO-512K-EXP* | Mistral-7B-LongPO-128K | HF Link | 2.5K |

* indicates an experimental version (for rebuttal purposes) that may not have been fully tuned or given sufficient data to reach convergence.

Evaluation

InfiniteBench

| Model | Train/Claimed Length | En.Sum | En.QA | En.MC | AVG. |
| --- | --- | --- | --- | --- | --- |
| GPT-4-128K | 128K | 14.73 | 22.44 | 67.25 | 34.81 |
| Qwen2-72B | 128K | 24.32ᵇ | 7.03ᵇ | 72.05ᵇ | 34.47ᵇ |
| LLaMA 3.1-70B | 128K | 33.55ᵇ | 36.08ᵇ | 69.00ᵇ | 46.21ᵇ |
| LLaMA 3.1-8B | 128K | 28.06ᵇ | 30.47ᵇ | 58.08ᵇ | 38.87ᵇ |
| GLM-4-9B | 128K | 14.84ᵇ | 9.51ᵇ | 67.25ᵇ | 30.53ᵇ |
| GLM-4-9B-1M | 1M | 28.3 | 9.7 | 68.6 | 35.53 |
| LWM-7B-1M | 1M | 4.33ᵇ | 0.0ᵇ | 3.06ᵇ | 2.46ᵇ |
| YaRN-Mistral-7B | 128K | 9.09 | 9.55 | 27.95 | 15.53 |
| Mistral-7B | 32K | 22.13 | 4.93 | 14.41 | 13.82 |
| - SFT | 128K | 23.44 | 13.45 | 53.21 | 30.03 |
| - DPO | 128K | 15.21 | 10.34 | 48.14 | 25.56 |
| - LongPO (iter1) | 128K | 27.05 | 23.51 | 67.25 | 39.27 |
| - LongPO (iter2) | 256K | 28.16 | 24.43 | 66.35 | 39.65 |
| - LongPO (iter3) | 512K | 29.10 | 27.85 | 66.67 | 41.21 |
| Qwen2.5-7B | 128K | 22.89 | 6.08 | 52.4 | 27.12 |
| - LongPO (iter1) | 128K | 32.06 | 17.32 | 72.05 | 40.48 |

Our results are evaluated with greedy decoding.
Baseline results marked with ᵇ were evaluated by us; unmarked baseline results are taken from the models' official reports.
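
For readers who want to reproduce the greedy-decoding setting locally, the sketch below shows one way to request it through Transformers. It is not the evaluation script used for the table: it assumes generation goes through `model.generate`, where disabling sampling and using a single beam selects the highest-probability token at each step. `GREEDY_KWARGS` and `generate_greedy` are hypothetical names introduced for illustration.

```python
# Assumption: generation runs through Hugging Face Transformers' generate().
# With sampling disabled and a single beam, generate() picks the
# highest-probability token at every step, i.e. greedy decoding.
GREEDY_KWARGS = {"do_sample": False, "num_beams": 1}


def generate_greedy(model, tokenizer, messages, max_new_tokens=40):
    """Hypothetical helper: return the greedy completion for a chat."""
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **GREEDY_KWARGS
    )
    # Drop the prompt tokens so only the newly generated text is decoded.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
```

Here `model` and `tokenizer` are the objects loaded in the Transformers example at the top of this page.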

RULER

| Model | NIAH | VT | AGG | QA | AVG (13 tasks) |
| --- | --- | --- | --- | --- | --- |
| Qwen2.5-7B-Instruct | 82.10 | 80.09 | 74.50 | 54.30 | 76.50 |
| Qwen2.5-7B-LongPO-128K | 95.82 | 89.71 | 78.67 | 59.40 | 87.11 |
| Mistral-7B-Instruct-v0.2 | 72.60 | 74.40 | 64.40 | 52.20 | 68.40 |
| Mistral-7B-LongPO-128K | 96.88 | 96.49 | 71.55 | 64.81 | 88.02 |
| Mistral-7B-LongPO-256K-EXP | 96.80 | 97.00 | 69.14 | 64.87 | 87.65 |
| Mistral-7B-LongPO-512K-EXP | 97.28 | 97.48 | 69.22 | 64.92 | 88.00 |

Short Context

| Model | MMLU | ARC-C | Hellaswag | Winogrande | Avg |
| --- | --- | --- | --- | --- | --- |
| Mistral-7B-Instruct-v0.2 | 59.15 | 59.26 | 83.2 | 78.4 | 70.00 |
| Mistral-7B-LongPO-128K | 59.99 | 59.34 | 82.99 | 78.53 | 70.21 |
| Mistral-7B-LongPO-256K-EXP | 59.47 | 60.28 | 83.14 | 78.14 | 70.26 |
| Mistral-7B-LongPO-512K-EXP | 59.51 | 60.58 | 82.87 | 77.66 | 70.16 |
| Qwen2.5-7B-Instruct | 74.28 | 67.15 | 81.41 | 74.66 | 74.38 |
| Qwen2.5-7B-LongPO-128K | 73.64 | 65.70 | 80.82 | 74.98 | 73.79 |
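
One way to read this table is to subtract each base model's reported Avg from the Avg of its LongPO variant. The short sketch below does that arithmetic using only the Avg column above; the dictionary layout and the `avg_delta` helper are illustrative choices, not part of the released code.

```python
# Reported "Avg" values copied from the short-context table above.
REPORTED_AVG = {
    "Mistral-7B-Instruct-v0.2": 70.00,
    "Mistral-7B-LongPO-128K": 70.21,
    "Mistral-7B-LongPO-256K-EXP": 70.26,
    "Mistral-7B-LongPO-512K-EXP": 70.16,
    "Qwen2.5-7B-Instruct": 74.38,
    "Qwen2.5-7B-LongPO-128K": 73.79,
}


def avg_delta(base, tuned):
    """Return tuned Avg minus base Avg, rounded to two decimals."""
    return round(REPORTED_AVG[tuned] - REPORTED_AVG[base], 2)


for base, tuned in [
    ("Mistral-7B-Instruct-v0.2", "Mistral-7B-LongPO-128K"),
    ("Mistral-7B-Instruct-v0.2", "Mistral-7B-LongPO-256K-EXP"),
    ("Mistral-7B-Instruct-v0.2", "Mistral-7B-LongPO-512K-EXP"),
    ("Qwen2.5-7B-Instruct", "Qwen2.5-7B-LongPO-128K"),
]:
    print(f"{tuned}: {avg_delta(base, tuned):+.2f} vs. {base}")
```

With the reported numbers, the Mistral variants differ from their base model by +0.21, +0.26, and +0.16 points, and the Qwen2.5 variant by -0.59 points.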

Citation

If you find our project useful, please consider starring our repo and citing our paper as follows:

@inproceedings{
    chen2025longpo,
    title={Long{PO}: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization},
    author={Guanzheng Chen and Xin Li and Michael Shieh and Lidong Bing},
    booktitle={The Thirteenth International Conference on Learning Representations},
    year={2025},
    url={https://openreview.net/forum?id=qTrEq31Shm}
}