How to use from vLLM
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "oro-ai/qwen3-4b-shoppingbench-kto"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "oro-ai/qwen3-4b-shoppingbench-kto",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
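The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started by `vllm serve` is reachable at `http://localhost:8000` as in the curl example. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "oro-ai/qwen3-4b-shoppingbench-kto"
# Assumption: same host and port as the curl example above.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt, model=MODEL_ID):
    """Build the JSON body for an OpenAI-compatible chat completion request."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def chat(prompt, base_url=BASE_URL, timeout=60.0):
    """POST one user message to the server and return the assistant's reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        reply = json.load(resp)
    # OpenAI-compatible responses put the generated text under choices[0].message.content.
    return reply["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` should return the model's answer as a string.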
Use Docker
docker model run hf.co/oro-ai/qwen3-4b-shoppingbench-kto

Qwen3-4B ShoppingBench SFT + KTO

Paper: arXiv:2606.10064
Code: https://github.com/ORO-AI/shoppingbench-trajectory-primitive

This model applies KTO preference refinement (v3) on top of the merged opd_renderers SFT champion. It reaches 42.7% ASR on the 75-problem leak-cluster-guarded held-out partition (production-strict scoring, temperature 0), matching the SFT champion.
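As a quick sanity check on the headline number, the sketch below converts the reported rate into a problem count. It assumes ASR is the unweighted fraction of problems solved in a single run, which this card does not state explicitly; under that assumption, 42.7% of 75 problems is consistent with 32 solved.

```python
def solved_count(rate_percent, n_problems):
    """Nearest whole number of solved problems for a given success rate."""
    return round(rate_percent / 100 * n_problems)


def rate_from_count(solved, n_problems):
    """Success rate in percent, rounded to one decimal place."""
    return round(100 * solved / n_problems, 1)


n = 75  # size of the held-out partition, from the card
solved = solved_count(42.7, n)
print(solved, rate_from_count(solved, n))  # 32 42.7
```

On a 75-problem partition each problem is worth about 1.3 percentage points, so small differences between models on this split correspond to only one or two problems.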

This model is a companion artifact for the paper "Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces". The published Qwen3-4B base scores 18.0% ASR on ShoppingBench; the distilled SFT-family models in this collection raise that to 42.7% on a leak-cluster-guarded held-out partition scored production-strict.

This is a merged full model (Qwen3-4B weights with the trained delta merged in), ready to load directly with transformers or serve with vLLM. No adapter stacking required.

Training data

Usage

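from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the merged full model; no adapter is needed.
tok = AutoTokenizer.from_pretrained("oro-ai/qwen3-4b-shoppingbench-kto")
# bfloat16 matches the published tensor type; device_map="auto" requires the accelerate package.
model = AutoModelForCausalLM.from_pretrained("oro-ai/qwen3-4b-shoppingbench-kto", torch_dtype="bfloat16", device_map="auto")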
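The snippet above only loads the model. The following is a minimal sketch of a single chat turn, assuming the repository ships the usual Qwen3 chat template and that `torch`, `transformers` and `accelerate` are installed. The helpers `build_messages` and `generate_reply` are illustrative names, and greedy decoding is used here to mirror the temperature-0 evaluation setting.

```python
MODEL_ID = "oro-ai/qwen3-4b-shoppingbench-kto"


def build_messages(user_prompt):
    """Wrap a single user prompt in the chat-message format."""
    return [{"role": "user", "content": user_prompt}]


def generate_reply(user_prompt, max_new_tokens=256):
    """Load the model, run one greedy chat turn, and return the decoded reply."""
    # Imported lazily so build_messages can be used without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="bfloat16", device_map="auto"
    )
    # Render the conversation with the tokenizer's chat template and tokenize it.
    inputs = tok.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)
    # do_sample=False selects greedy decoding.
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and decode only the newly generated ones.
    prompt_len = inputs["input_ids"].shape[-1]
    return tok.decode(output[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate_reply("Find a waterproof hiking backpack under $50.")` downloads the weights on first use and returns the model's text. The prompt is only an example; the agent prompt format and tool setup used for the ShoppingBench evaluation are not described in this card.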

License

Apache-2.0, inherited from the Qwen3-4B base model.
