tw-wm-token-match-llama-step171 / model-00001-of-00004.safetensors

Commit History

LLaMA3.1-8B TextWorld Token-Match GRPO step-171
27d71b3
verified

Ricardo-H committed on