Building on HF


Elena M

borntobeignored

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

upvoted an article 3 days ago

Holo3.1: Fast & Local Computer Use Agents

liked a model 4 days ago

poolside/Laguna-M.1


Organizations

upvoted a paper 2 days ago

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

Paper • 2606.17682 • Published 6 days ago • 25

upvoted an article 3 days ago

Holo3.1: Fast & Local Computer Use Agents

Article • Hcompany • 20 days ago • 32

upvoted 4 papers 9 days ago

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Paper • 2606.11926 • Published 12 days ago • 115

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 17 days ago • 72

Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

Paper • 2606.12344 • Published 12 days ago • 68

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

Paper • 2606.13681 • Published 11 days ago • 140

upvoted a paper 10 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 14 days ago • 33

upvoted a collection 12 days ago

Gemma 4

Collection • 15 items • Updated 11 days ago • 981

upvoted an article 12 days ago

The Open Source Community is backing OpenEnv for Agentic RL

Article • burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego • 14 days ago • 89

upvoted a paper 16 days ago

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

Paper • 2605.31433 • Published 24 days ago • 28

upvoted a paper 17 days ago

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published 22 days ago • 44

upvoted a collection 18 days ago

Nemotron-Post-Training-v3

Collection • Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 10 days ago • 160

upvoted an article 23 days ago

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

Article • ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • 24 days ago • 122

upvoted a paper 25 days ago

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published 28 days ago • 34

upvoted a paper 28 days ago

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 198

upvoted a paper 29 days ago

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 113

upvoted a collection about 1 month ago

Mellum

Collection • Series of code models by JetBrains • 12 items • Updated Oct 1, 2025 • 49

upvoted 2 papers about 1 month ago

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 50

Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published Apr 1 • 56

upvoted an article about 1 month ago

TRL v1.0: Post-Training Library Built to Move with the Field

Article • qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 56
