tuandunghcmut's Collections
Multilingual Pretraining Corpus for Southeast Asian Language
Updated Mar 26
aisingapore/SEA-PILE-v2 • Updated Apr 14, 2025 • 187M • 705 • 6
aisingapore/SEA-PILE-v1 • Updated Dec 2, 2025 • 636M • 168 • 18
airesearch/scb_mt_enth_2020 • Updated Jan 18, 2024 • 200 • 9
aisingapore/WangchanLION-Web • Updated Sep 3, 2025 • 19.8M • 838 • 3
aisingapore/WangchanLION-Curated • Updated Sep 3, 2025 • 402k • 166 • 3
tuandunghcmut/PhoMT-MTet-Mixture • Updated Aug 11, 2025 • 7.62M • 104 • 2
HuggingFaceFW/clean-wikipedia • Updated Oct 21, 2025 • 61.2M • 4.53k • 24
uonlp/CulturaX • Updated Dec 16, 2024 • 7.18B • 18.4k • 643
allenai/c4 • Updated Jan 9, 2024 • 10.4B • 1.25M • 604