nvidia/parakeet-tdt-0.6b-v3

@@ -805,6 +805,7 @@ img {
 **Supported Languages:**
 Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (**nl**), English (**en**), Estonian (**et**), Finnish (**fi**), French (**fr**), German (**de**), Greek (**el**), Hungarian (**hu**), Italian (**it**), Latvian (**lv**), Lithuanian (**lt**), Maltese (**mt**), Polish (**pl**), Portuguese (**pt**), Romanian (**ro**), Slovak (**sk**), Slovenian (**sl**), Spanish (**es**), Swedish (**sv**), Russian (**ru**), Ukrainian (**uk**)
 ## <span style="color:#466f00;">Key Features:</span>
@@ -815,9 +816,9 @@ Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (*
 * **Long audio** transcription, supporting audio **up to 24 minutes** long with full attention (on A100 80GB) or up to 3 hours with local attention.
 * Released under a **permissive CC BY 4.0 license**
-This model is ready for commercial/non-commercial use.
----
 ## Automatic Speech Recognition (ASR) Performance
@@ -833,11 +834,6 @@ This model is ready for commercial/non-commercial use.
 **Note 2:** Performance differences may be partly attributed to Portuguese variant differences - our training data uses European Portuguese while most benchmarks use Brazilian Portuguese.
-## <span style="color:#466f00;">License/Terms of Use:</span>
-GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
 ### <span style="color:#466f00;">Deployment Geography:</span>
 Global
@@ -849,7 +845,8 @@ This model serves developers, researchers, academics, and industries building ap
 ### <span style="color:#466f00;">Release Date:</span>
-08/14/2025
 ### <span style="color:#466f00;">Model Architecture:</span>
@@ -936,7 +933,7 @@ print(output[0].text)
 ## <span style="color:#466f00;">Software Integration:</span>
 **Runtime Engine(s):**
-* NeMo 2.5
 **Supported Hardware Microarchitecture Compatibility:**
@@ -1136,4 +1133,47 @@ NVIDIA believes Trustworthy AI is a shared responsibility and we have establishe
 For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards [here](https://developer.nvidia.com/blog/enhancing-ai-transparency-and-ethical-considerations-with-model-card/).
-Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

 **Supported Languages:**
 Bulgarian (**bg**), Croatian (**hr**), Czech (**cs**), Danish (**da**), Dutch (**nl**), English (**en**), Estonian (**et**), Finnish (**fi**), French (**fr**), German (**de**), Greek (**el**), Hungarian (**hu**), Italian (**it**), Latvian (**lv**), Lithuanian (**lt**), Maltese (**mt**), Polish (**pl**), Portuguese (**pt**), Romanian (**ro**), Slovak (**sk**), Slovenian (**sl**), Spanish (**es**), Swedish (**sv**), Russian (**ru**), Ukrainian (**uk**)
+This model is ready for commercial/non-commercial use.
 ## <span style="color:#466f00;">Key Features:</span>
 * **Long audio** transcription, supporting audio **up to 24 minutes** long with full attention (on A100 80GB) or up to 3 hours with local attention.
 * Released under a **permissive CC BY 4.0 license**
+## <span style="color:#466f00;">License/Terms of Use:</span>
+GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
 ## Automatic Speech Recognition (ASR) Performance
 **Note 2:** Performance differences may be partly attributed to Portuguese variant differences - our training data uses European Portuguese while most benchmarks use Brazilian Portuguese.
 ### <span style="color:#466f00;">Deployment Geography:</span>
 Global
 ### <span style="color:#466f00;">Release Date:</span>
+Hugging Face [08/14/2025](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v3)
 ### <span style="color:#466f00;">Model Architecture:</span>
 ## <span style="color:#466f00;">Software Integration:</span>
 **Runtime Engine(s):**
+* NeMo 2.4
 **Supported Hardware Microarchitecture Compatibility:**
 For more detailed information on ethical considerations for this model, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards [here](https://developer.nvidia.com/blog/enhancing-ai-transparency-and-ethical-considerations-with-model-card/).
+Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).
+## <span style="color:#466f00;">Bias:</span>
+Field                                                                                               |  Response
+---------------------------------------------------------------------------------------------------|---------------
+Participation considerations from adversely impacted groups [protected classes](https://www.senate.ca.gov/content/protected-classes) in model design and testing  |  None
+Measures taken to mitigate against unwanted bias    | None
+## <span style="color:#466f00;">Explainability:</span>
+Field                                                                                                  |  Response
+------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------
+Intended Domain                                                                   |  Speech to Text Transcription
+Model Type                                                                                            |  FastConformer
+Intended Users                                                                                        |  This model is intended for developers, researchers, academics, and industries building conversational applications.
+Output                                                                                                |  Text
+Describe how the model works                                                                          |  Speech input is encoded into embeddings and passed into a conformer-based model, which outputs a text response.
+Name the adversely impacted groups this has been tested to deliver comparable outcomes regardless of  |  Not Applicable
+Technical Limitations & Mitigation                                                                    |  Transcripts may not be 100% accurate. Accuracy varies based on language and characteristics of input audio (Domain, Use Case, Accent, Noise, Speech Type, Context of speech, etc.).
+Verified to have met prescribed NVIDIA quality standards  |  Yes
+Performance Metrics                                                                                   | Word Error Rate
+Potential Known Risks                                                                                 |  If a word was not seen during training and is not present in the vocabulary, it is unlikely to be recognized. Not recommended for word-for-word/incomplete sentences, as accuracy varies based on the context of the input text.
+Licensing                                                                                             |  GOVERNING TERMS: Use of this model is governed by the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) license.
+## <span style="color:#466f00;">Privacy:</span>
+Field                                                                                                                              |  Response
+----------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------
+Generatable or reverse engineerable personal data?                                                     |  None
+Personal data used to create this model?                                                                                       |  None
+Is there provenance for all datasets used in training?                                                                                |  Yes
+Does data labeling (annotation, metadata) comply with privacy laws?                                                                |  Yes
+Is data compliant with data subject requests for data correction or removal, if such a request was made?                           |  No, not possible with externally-sourced data.
+Applicable Privacy Policy        | https://www.nvidia.com/en-us/about-nvidia/privacy-policy/
+## <span style="color:#466f00;">Safety:</span>
+Field                                               |  Response
+---------------------------------------------------|----------------------------------
+Model Application(s)                               |  Speech to Text Transcription
+Describe the life critical impact   |  None
+Use Case Restrictions                              | Abide by [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode.en) License
+Model and dataset restrictions            |  The principle of least privilege (PoLP) is applied, limiting access for dataset generation and model development. Restrictions enforce dataset access during training, and dataset license constraints are adhered to.