	# Whisper Tiny Cantonese - CoreML

	CoreML conversion of [Oblivion208/whisper-tiny-cantonese](https://huggingface.co/Oblivion208/whisper-tiny-cantonese) for use with [WhisperKit](https://github.com/argmaxinc/WhisperKit) on Apple devices.

	## Model Description

	This is a CoreML-converted and quantized version of the Whisper Tiny model fine-tuned for Cantonese (粵語) speech recognition. The model outputs colloquial Cantonese (口語) rather than written Chinese (書面語).

	- Base Model: [openai/whisper-tiny](https://huggingface.co/openai/whisper-tiny)
	- Fine-tuned Model: [Oblivion208/whisper-tiny-cantonese](https://huggingface.co/Oblivion208/whisper-tiny-cantonese)
	- Conversion Tool: [argmaxinc/whisperkittools](https://github.com/argmaxinc/whisperkittools)
	- Format: CoreML (.mlmodelc)
	- Quantization: 8-bit palettization (~59 MB total)

	## Performance

	- Character Error Rate (CER): 11.10% on the MDCC test set, as reported for the original fine-tuned model (before CoreML conversion and quantization)
	- Optimized for: Apple Neural Engine (ANE)
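
	For readers unfamiliar with the metric, CER is usually defined as the character-level edit distance between the reference transcript and the model output, divided by the number of reference characters. The following is a minimal sketch of that definition, not the evaluation script used for the figure above. The function names `editDistance` and `characterErrorRate` are hypothetical, and any text normalization (punctuation, spacing) applied in the original evaluation is unknown and not reproduced here.

	```swift
	/// Levenshtein edit distance between two character sequences
	/// (insertions, deletions and substitutions each cost 1).
	func editDistance(_ a: [Character], _ b: [Character]) -> Int {
	    if a.isEmpty { return b.count }
	    if b.isEmpty { return a.count }
	    // previous[j] = distance between the first i-1 chars of a and the first j chars of b
	    var previous = Array(0...b.count)
	    for i in 1...a.count {
	        var current = [i] + Array(repeating: 0, count: b.count)
	        for j in 1...b.count {
	            let substitution = previous[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1)
	            current[j] = min(previous[j] + 1, current[j - 1] + 1, substitution)
	        }
	        previous = current
	    }
	    return previous[b.count]
	}

	/// CER = edit distance / number of reference characters.
	/// Hypothetical helper; no text normalization is applied.
	func characterErrorRate(reference: String, hypothesis: String) -> Double {
	    let ref = Array(reference)
	    let hyp = Array(hypothesis)
	    guard !ref.isEmpty else { return hyp.isEmpty ? 0 : 1 }
	    return Double(editDistance(ref, hyp)) / Double(ref.count)
	}

	// One substitution (哋 -> 地) and one deletion (去): 2 edits / 7 reference characters.
	let cer = characterErrorRate(reference: "我哋今日去飲茶", hypothesis: "我地今日飲茶")
	print(cer)
	```

	Because the score is normalized by reference length, it can exceed 1.0 when the hypothesis is much longer than the reference.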

	## License

	This model is distributed under the Apache 2.0 License.

	### Attribution

	This work is based on:

	1. OpenAI Whisper - Original Whisper model
	   - License: MIT License
	   - Copyright (c) 2022 OpenAI
	   - https://github.com/openai/whisper

	2. Oblivion208/whisper-tiny-cantonese - Fine-tuned Cantonese model
	   - License: Apache 2.0
	   - https://huggingface.co/Oblivion208/whisper-tiny-cantonese
	   - Trained on: mozilla-foundation/common_voice_11_0

	3. WhisperKit - CoreML conversion tools
	   - License: MIT License
	   - Copyright (c) 2024 Argmax, Inc.
	   - https://github.com/argmaxinc/WhisperKit

	### Changes Made

	- Converted from PyTorch to CoreML format using whisperkittools
	- Applied 8-bit mixed-precision quantization for reduced model size
	- Packaged with tokenizer and configuration files for WhisperKit compatibility

	## Usage

	This model is designed for use with WhisperKit on iOS/macOS:

	```swift
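	import WhisperKit

	// Load the CoreML model from a local folder and run the audio encoder
	// on the CPU and Apple Neural Engine.
	let whisperKit = try await WhisperKit(
	    modelFolder: "path/to/whisper-tiny-cantonese",
	    computeOptions: .init(audioEncoderCompute: .cpuAndNeuralEngine)
	)

	// Transcribe a local audio file.
	let result = try await whisperKit.transcribe(audioPath: "audio.wav")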
	```

	## Disclaimer

	This model is provided "as is" without warranty of any kind. Performance may vary based on audio quality, accent, and speaking style.