Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.14704

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 34
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

smol-training-playbook

Running on CPU Upgrade

Featured

3.2k

The Smol Training Playbook

📚

3.2k

The secrets to building world-class LLMs
LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published Jan 22 • 87
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

Paper • 2601.15876 • Published Jan 22 • 92
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 40

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

Paper • 2504.19678 • Published Apr 28, 2025 • 3
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions

Paper • 2503.23278 • Published Mar 30, 2025 • 1
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28, 2025 • 180
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools

Paper • 2510.19286 • Published Oct 22, 2025 • 9
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43
mlfoundations-cua-dev/osworld-trajectories

Viewer • Updated Nov 5, 2025 • 13.7k • 1.24k
xlangai/ubuntu_osworld_verified_trajs

Updated 8 days ago • 1.86k • 14

isaacus/contractual-clause-retrieval

Viewer • Updated Oct 23, 2025 • 225 • 509 • 5
isaacus/echr-retrieval

Viewer • Updated Feb 20 • 600 • 65 • 1
isaacus/gdpr-holdings-retrieval

Viewer • Updated Oct 23, 2025 • 1.5k • 94 • 3
isaacus/license-tldr-retrieval

Viewer • Updated Oct 23, 2025 • 195 • 93 • 1

Foundational & Modern AI Research (Curated)

A curated selection of foundational and modern AI research papers that meaningfully influence how real-world AI systems are designed, evaluated, and g

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 124
Scaling Laws for Neural Language Models

Paper • 2001.08361 • Published Jan 23, 2020 • 10
Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 11
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT

Paper • 2210.04186 • Published Oct 9, 2022

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

Peer-Reviewed Articles

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22, 2025 • 162
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 183
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13, 2025 • 53

From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

Paper • 2508.14111 • Published Aug 18, 2025 • 33
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23, 2024 • 18
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4, 2024 • 29
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30, 2024 • 34
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30, 2024 • 7

isaacus/contractual-clause-retrieval

Viewer • Updated Oct 23, 2025 • 225 • 509 • 5
isaacus/echr-retrieval

Viewer • Updated Feb 20 • 600 • 65 • 1
isaacus/gdpr-holdings-retrieval

Viewer • Updated Oct 23, 2025 • 1.5k • 94 • 3
isaacus/license-tldr-retrieval

Viewer • Updated Oct 23, 2025 • 195 • 93 • 1

smol-training-playbook

Running on CPU Upgrade

Featured

3.2k

The Smol Training Playbook

📚

3.2k

The secrets to building world-class LLMs
LLM-in-Sandbox Elicits General Agentic Intelligence

Paper • 2601.16206 • Published Jan 22 • 87
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience

Paper • 2601.15876 • Published Jan 22 • 92
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published Oct 9, 2025 • 40

Foundational & Modern AI Research (Curated)

A curated selection of foundational and modern AI research papers that meaningfully influence how real-world AI systems are designed, evaluated, and g

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 124
Scaling Laws for Neural Language Models

Paper • 2001.08361 • Published Jan 23, 2020 • 10
Training Compute-Optimal Large Language Models

Paper • 2203.15556 • Published Mar 29, 2022 • 11
Analogy Generation by Prompting Large Language Models: A Case Study of InstructGPT

Paper • 2210.04186 • Published Oct 9, 2022

From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review

Paper • 2504.19678 • Published Apr 28, 2025 • 3
Model Context Protocol (MCP): Landscape, Security Threats, and Future Research Directions

Paper • 2503.23278 • Published Mar 30, 2025 • 1
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

Paper • 2508.20453 • Published Aug 28, 2025 • 63
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28, 2025 • 180
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools

Paper • 2510.19286 • Published Oct 22, 2025 • 9
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

Peer-Reviewed Articles

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs

Paper • 2508.16153 • Published Aug 22, 2025 • 162
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20, 2024 • 183
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13, 2025 • 53

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43
mlfoundations-cua-dev/osworld-trajectories

Viewer • Updated Nov 5, 2025 • 13.7k • 1.24k
xlangai/ubuntu_osworld_verified_trajs

Updated 8 days ago • 1.86k • 14

From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

Paper • 2508.14111 • Published Aug 18, 2025 • 33
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

Previous
1
2
Next

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs