Ricardo-H's Collections
WS TM-WM Sweep - Qwen3 Agents (40k)
BehR-WM (LLaMA3.1-8B) TextWorld WM/W2R Trajectories
TW WM-TM (LLaMA3.1-8B) Step171 TextWorld WM/W2R Trajectories
WebShop TM-WM Checkpoint Sweep - Qwen3-32B Agent (32k, TP=4)
TW WM-TM Step170 TextWorld WM/W2R Trajectories
tw-wm-tm-0501
Step92 WebShop WM/W2R Trajectories
ws-llama-webshop-token-match-0429
OCAR · Surprise Agent-RL (Archived)
BehR: Behavior-Consistent World Models
alfworld-dual-token-0416
ws-wm-0410ministral
grpo-alfworld-0410
ws-wm-crossjudge-llama-0406
rlvr-f1-llama-textworld-f1
rlvr-f1-llama-webshop-f1
rlvr-f1
ws-wm-0314
ws-wm-f1-0314
ws-wm-llama-0227
ws-wm-0224
BehR-WM (LLaMA3.1-8B) TextWorld WM/W2R Trajectories
Updated May 2
WM/W2R trajectories of Ricardo-H/BehR-WorldModel-Textworld-Llama3.1-8B on the TextWorld test split.
Ricardo-H/tw-behr-llama-3.1-8b-textworld-wm-w2r-qwen3-8b • Updated May 2 • 425
Ricardo-H/tw-behr-llama-3.1-8b-textworld-wm-w2r-qwen3-32b • Updated May 2 • 421