mangaba-ai's Collections
Image Editing
Music & Audio
Video
Code / Coding LLMs
Embeddings & RAG
Vision / Multimodal (VLM)
Voice (TTS + STT)
Image
Text / LLM

Voice (TTS + STT)

updated 10 days ago

The best open-source voice models: speech synthesis (TTS) and speech recognition (ASR/STT). · Curated by Mangaba AI 🥭

  • Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

    Text-to-Speech • 2B • Updated Jan 29 • 2.14M • 1.63k

  • openbmb/VoxCPM2

    Text-to-Speech • 2B • Updated Apr 16 • 529k • 1.42k

  • Supertone/supertonic-3

    Text-to-Speech • Updated May 18 • 53.9k • 851

  • Qwen/Qwen3-ASR-1.7B

    Automatic Speech Recognition • 2B • Updated Jan 30 • 1.62M • 898

  • nvidia/parakeet-tdt-0.6b-v3

    Automatic Speech Recognition • 0.6B • Updated May 20 • 172k • 941

  • CohereLabs/cohere-transcribe-03-2026

    Automatic Speech Recognition • 2B • Updated 14 days ago • 736k • 1.01k

  • hexgrad/Kokoro-82M

    Text-to-Speech • Updated Apr 10, 2025 • 16.2M • 6.38k

  • k2-fsa/OmniVoice

    Text-to-Speech • 0.6B • Updated May 7 • 1.26M • 1.07k

  • fishaudio/s2-pro

    Text-to-Speech • 5B • Updated Mar 11 • 369k • 1.05k