Training Datasets (Voice Models)
amphion/Emilia-Dataset • Updated Feb 28, 2025 • 54.8M • 44.4k • 459
Training Datasets (Language Models)
allenai/dolma • Updated Apr 17, 2024 • 4.08k • 1.04k
OpenCoder-LLM/opc-sft-stage1 • Updated Nov 24, 2024 • 4.22M • 1.23k • 74
OpenCoder-LLM/opc-fineweb-math-corpus • Updated Nov 24, 2024 • 5.24M • 1.02k • 31
OpenCoder-LLM/opc-fineweb-code-corpus • Updated Nov 24, 2024 • 101M • 4.79k • 52