A multilingual news corpus built from Common Crawl CC-News, indexed and queryable in milliseconds, cleaned and enriched with language and topic id
Ruggero Marino Lazzaroni
ruggsea
AI & ML interests
NLP in any form
Recent Activity
liked a model about 18 hours ago: Eriskii/LFM2.5-8B-A1B-Multichannel
updated a dataset 3 days ago: ruggsea/social-sim-bench-gens
published a dataset 24 days ago: ruggsea/social-sim-bench-gens