WM/W2R trajectories of WebShop TM-WM (step60/80/92) with Qwen3-8B and Qwen3-32B agents at 40k context.
YOULING HUANG
Ricardo-H
AI & ML interests
None yet
Recent Activity
updated a dataset about 1 month ago
Ricardo-H/ws-w2w-llama-3.1-8b-webshop-wm-w2r-qwen3-32b
published a dataset about 1 month ago
Ricardo-H/ws-w2w-llama-3.1-8b-webshop-wm-w2r-qwen3-32b
updated a collection about 1 month ago
WS TM-WM Sweep - Qwen3 Agents (40k)
Organizations
None yet
models 161
Ricardo-H/tw-wm-token-match-llama-step171
8B • Updated • 1
Ricardo-H/tw-wm-tm-0501-step-170
8B • Updated • 2
Ricardo-H/ws-qwen-webshop-token-match-0430-step-84
8B • Updated
Ricardo-H/ws-llama-webshop-token-match-0429-step-92
8B • Updated • 1
Ricardo-H/ws-llama-webshop-token-match-0429-step-80
8B • Updated • 1
Ricardo-H/ws-llama-webshop-token-match-0429-step-60
8B • Updated • 1
Ricardo-H/ocar-gigpo-observe-alfworld-1.5b
2B • Updated
Ricardo-H/ocar-grpo-observe-alfworld-1.5b
2B • Updated
Ricardo-H/ocar-v3-alfworld-7b
8B • Updated
Ricardo-H/ocar-grpo-observe-alfworld-7b
8B • Updated
datasets 23
Ricardo-H/ws-w2w-llama-3.1-8b-webshop-wm-w2r-qwen3-32b
Updated • 426
Ricardo-H/ws-step92-webshop-wm-w2r-qwen3-32b-40k
Updated • 424
Ricardo-H/ws-step92-webshop-wm-w2r-qwen3-8b-40k
Updated • 635
Ricardo-H/ws-step80-webshop-wm-w2r-qwen3-32b-40k
Updated • 397
Ricardo-H/ws-step80-webshop-wm-w2r-qwen3-8b-40k
Updated • 429
Ricardo-H/ws-step60-webshop-wm-w2r-qwen3-32b-40k
Updated • 434
Ricardo-H/ws-step60-webshop-wm-w2r-qwen3-8b-40k
Updated • 448
Ricardo-H/tw-behr-llama-3.1-8b-textworld-wm-w2r-qwen3-32b
Updated • 421
Ricardo-H/tw-behr-llama-3.1-8b-textworld-wm-w2r-qwen3-8b
Updated • 425
Ricardo-H/tw-step171-llama-textworld-wm-w2r-qwen3-32b
Updated • 416