# Whisper Tiny Cantonese - CoreML

CoreML conversion of [Oblivion208/whisper-tiny-cantonese](https://huggingface.co/Oblivion208/whisper-tiny-cantonese) for use with [WhisperKit](https://github.com/argmaxinc/WhisperKit) on Apple devices.

## Model Description

This is a CoreML-converted and quantized version of the Whisper Tiny model fine-tuned for Cantonese (粵語) speech recognition. The model outputs colloquial Cantonese (口語) rather than written Chinese (書面語).

- **Base Model**: [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
- **Fine-tuned Model**: [Oblivion208/whisper-tiny-cantonese](https://huggingface.co/Oblivion208/whisper-tiny-cantonese)
- **Conversion Tool**: [argmaxinc/whisperkittools](https://github.com/argmaxinc/whisperkittools)
- **Format**: CoreML (.mlmodelc)
- **Quantization**: 8-bit palettization (~59MB total)

## Performance

- **Character Error Rate (CER)**: 11.10% on the MDCC test set, as reported for the original fine-tuned model before CoreML conversion and quantization
- **Optimized for**: Apple Neural Engine (ANE)
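
For readers unfamiliar with the metric, CER is the character-level edit distance between the model output and the reference transcript, divided by the number of reference characters. The following is a minimal sketch of that calculation, not the evaluation script used to produce the figure above; the function names and the sample sentences are illustrative assumptions, and a real evaluation may normalize text (for example, strip punctuation) before scoring.

```swift
/// Levenshtein distance between two character sequences
/// (counts substitutions, insertions, and deletions).
func editDistance(_ a: [Character], _ b: [Character]) -> Int {
    if a.isEmpty { return b.count }
    if b.isEmpty { return a.count }
    // previous[j] holds the distance between a[0..<i-1] and b[0..<j].
    var previous = Array(0...b.count)
    for i in 1...a.count {
        var current = [i] + Array(repeating: 0, count: b.count)
        for j in 1...b.count {
            let cost = a[i - 1] == b[j - 1] ? 0 : 1
            current[j] = min(
                previous[j] + 1,        // deletion
                current[j - 1] + 1,     // insertion
                previous[j - 1] + cost  // substitution or match
            )
        }
        previous = current
    }
    return previous[b.count]
}

/// CER = edit distance / number of reference characters.
/// Convention chosen here (an assumption): an empty reference scores 0
/// if the hypothesis is also empty, otherwise 1.
func characterErrorRate(reference: String, hypothesis: String) -> Double {
    let ref = Array(reference)
    let hyp = Array(hypothesis)
    guard !ref.isEmpty else { return hyp.isEmpty ? 0 : 1 }
    return Double(editDistance(ref, hyp)) / Double(ref.count)
}

// Hypothetical example: the hypothesis uses written-Chinese 們 where the
// reference has colloquial 哋, i.e. one substitution in nine characters.
let cer = characterErrorRate(
    reference: "我哋今日去邊度食飯",
    hypothesis: "我們今日去邊度食飯"
)
print("CER:", cer)
```

In this example a single wrong character in a nine-character sentence gives a CER of roughly 11%, which is about the error level reported above.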

## License

This model is distributed under the **Apache 2.0 License**.

### Attribution

This work is based on:

1. **OpenAI Whisper** - Original Whisper model
   - License: MIT License
   - Copyright (c) 2022 OpenAI
   - https://github.com/openai/whisper

2. **Oblivion208/whisper-tiny-cantonese** - Fine-tuned Cantonese model
   - License: Apache 2.0
   - https://huggingface.co/Oblivion208/whisper-tiny-cantonese
   - Trained on: mozilla-foundation/common_voice_11_0

3. **WhisperKit** - CoreML conversion tools
   - License: MIT License
   - Copyright (c) 2024 Argmax, Inc.
   - https://github.com/argmaxinc/WhisperKit

### Changes Made

- Converted from PyTorch to CoreML format using whisperkittools
- Applied 8-bit palettization (a form of weight quantization) to reduce model size
- Packaged with tokenizer and configuration files for WhisperKit compatibility

## Usage

This model is designed for use with WhisperKit on iOS/macOS:

```swift
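import CoreML
import WhisperKit

// Load the converted model from a local folder and request the
// Apple Neural Engine (with CPU fallback) for the audio encoder.
let whisperKit = try await WhisperKit(
    modelFolder: "path/to/whisper-tiny-cantonese",
    computeOptions: .init(audioEncoderCompute: .cpuAndNeuralEngine)
)

// Transcribe a local audio file.
let result = try await whisperKit.transcribe(audioPath: "audio.wav")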
```
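
A wrong `modelFolder` path is an easy mistake to make, so it can help to check the folder before handing it to WhisperKit. The sketch below uses only Foundation. The file names in `expectedModelFiles` are an assumption based on the layout commonly seen in WhisperKit model folders and are not confirmed by this README; `missingModelFiles` is a hypothetical helper, not part of WhisperKit.

```swift
import Foundation

// Assumed names of the compiled models inside the model folder.
// Adjust this list to match the files you actually downloaded.
let expectedModelFiles = [
    "MelSpectrogram.mlmodelc",
    "AudioEncoder.mlmodelc",
    "TextDecoder.mlmodelc",
]

/// Returns the entries from `expected` that do not exist in `folder`.
func missingModelFiles(
    in folder: URL,
    expected: [String] = expectedModelFiles
) -> [String] {
    expected.filter { name in
        let path = folder.appendingPathComponent(name).path
        return !FileManager.default.fileExists(atPath: path)
    }
}

let folder = URL(fileURLWithPath: "path/to/whisper-tiny-cantonese")
let missing = missingModelFiles(in: folder)
if missing.isEmpty {
    print("Model folder looks complete")
} else {
    print("Missing from model folder: \(missing.joined(separator: ", "))")
}
```

Running this check first gives a clearer message than a model-loading failure later on.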

## Disclaimer

This model is provided "as is" without warranty of any kind. Performance may vary based on audio quality, accent, and speaking style.