---
base_model: tencent/HY-MT1.5-1.8B
base_model_relation: quantized
library_name: mnn
license: other
license_name: tencent-hunyuan-community
license_link: https://huggingface.co/tencent/HY-MT1.5-1.8B/blob/main/LICENSE
language:
- multilingual
- en
- zh
- ja
- ko
- de
- fr
- es
- pt
- ru
- ar
tags:
- translation
- mnn
- quantized
- 4-bit
- apple-silicon
- edge-inference
- ios
- macos
- mobile
pipeline_tag: translation
---
# HY-MT1.5-1.8B-MNN
This is a 4-bit quantized MNN version of [Tencent's HY-MT1.5-1.8B](https://huggingface.co/tencent/HY-MT1.5-1.8B) translation model, optimized for Apple Silicon (iOS/macOS) edge inference.
## Model Description
HY-MT1.5-1.8B is a lightweight version of the HY-MT1.5 series, specifically designed for edge devices:
- **36 Language Support**: Extended language coverage
- **Edge Optimized**: Designed for mobile and edge deployment
- **Terminology Intervention**: Custom terminology control during translation
- **Context-Aware Translation**: Improved accuracy with context understanding
- **Industry-Leading Performance**: Best-in-class for its parameter size
## Quantization Details
| Property | Value |
|----------|-------|
| Original Model | [tencent/HY-MT1.5-1.8B](https://huggingface.co/tencent/HY-MT1.5-1.8B) |
| Original Size | ~3.8 GB |
| Quantized Size | **1.07 GB** |
| Size Reduction | ~72% (about 3.6x smaller) |
| Quantization Type | 4-bit (q4_k_m) |
| Block Size | 64 |
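
The figures in this table can be cross-checked with a little arithmetic. The sketch below reproduces the 72% figure from the two sizes listed above and estimates how much space 4-bit weights with a block size of 64 would occupy. The 1.8e9 parameter count (read off the model name) and the 32 bits of per-block metadata (one fp16 scale plus one fp16 zero point) are assumptions for illustration, not values taken from the export.

```python
# Cross-check the quantization table (sizes in decimal GB).
original_gb = 3.8    # from the table above
quantized_gb = 1.07  # from the table above

reduction = 1 - quantized_gb / original_gb   # fraction of the size removed
factor = original_gb / quantized_gb          # how many times smaller

# Rough estimate of the quantized weight size.
# Assumptions (not from the export): 1.8e9 parameters, and each block of
# 64 weights carries one fp16 scale and one fp16 zero point (32 bits).
params = 1.8e9
block_size = 64
bits_per_weight = 4 + 32 / block_size        # 4.5 effective bits per weight
estimated_gb = params * bits_per_weight / 8 / 1e9

print(f"size reduction: {reduction:.1%}")
print(f"compression factor: {factor:.2f}x")
print(f"estimated size: {estimated_gb:.2f} GB")
```

Under these assumptions the estimate comes out slightly below the reported 1.07 GB.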
## Hardware Acceleration
Optimized for Apple Silicon with:
- ✅ INT8 Dot Product (i8sdot)
- ✅ FP16 Operations
- ✅ INT8 Matrix Multiply (i8mm)
- ✅ Scalable Matrix Extension 2 (sme2)
- ✅ Metal GPU Acceleration
- ✅ Apple Neural Engine (ANE) compatible
## Files
```
├── llm.mnn           # Model structure (576 KB)
├── llm.mnn.weight    # Quantized weights (1.07 GB)
├── tokenizer.txt     # Tokenizer vocabulary
├── llm_config.json   # MNN runtime config
├── config.json       # Model config
├── model_info.json   # Model metadata
└── export_args.json  # Conversion parameters
```
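
Before pointing a runtime at the model directory, it can help to confirm that the download is complete. The following is a minimal sketch that checks for the file names in the listing above; the helper name `missing_files` is hypothetical and not part of MNN.

```python
from pathlib import Path

# File names taken from the listing above.
EXPECTED_FILES = [
    "llm.mnn",
    "llm.mnn.weight",
    "tokenizer.txt",
    "llm_config.json",
    "config.json",
    "model_info.json",
    "export_args.json",
]


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_files("/path/to/HY-MT1.5-1.8B-MNN")` returns an empty list when every file is present.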
## Usage
### With MNN LLM Demo
```bash
# Clone MNN and build llm_demo
git clone https://github.com/alibaba/MNN.git
cd MNN && mkdir build && cd build
cmake .. -DMNN_BUILD_LLM=ON -DMNN_LOW_MEMORY=ON
make -j8 llm_demo
# Run inference from the build directory, passing the model's config.json
./llm_demo /path/to/HY-MT1.5-1.8B-MNN/config.json
```
### Example
```
User: Translate into English: 今天天氣很好
A: The weather is very nice today.
```
### Prompt Templates
```
# Basic translation
Translate into {language}:
{text}
# With terminology
Translate into {language}, using terms: {terms}
{text}
# With context
Context: {context}
Translate into {language}:
{text}
```
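
The three templates above can be assembled by one small helper. This is a sketch rather than an official API: the function name `build_prompt` is hypothetical, `terms` and `context` are inserted verbatim because the templates do not specify their format, and combining both in a single prompt is an extrapolation that the templates do not cover.

```python
def build_prompt(text, language, terms=None, context=None):
    """Build a translation prompt following the templates above.

    terms and context are optional strings inserted verbatim.
    Passing both at once is an extrapolation, not a documented template.
    """
    if terms:
        instruction = f"Translate into {language}, using terms: {terms}"
    else:
        instruction = f"Translate into {language}:"

    lines = []
    if context:
        lines.append(f"Context: {context}")
    lines.append(instruction)
    lines.append(text)
    return "\n".join(lines)


# Basic translation prompt for the example sentence used in this card.
print(build_prompt("今天天氣很好", "English"))
```

Keeping the template logic in one place makes it easier to adjust the wording if the model responds better to a different phrasing.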
## Performance
| Metric | Value |
|--------|-------|
| Model Load Time | ~1s |
| Inference Speed | 40-60 tokens/s |
| Target Device | iOS / Apple Silicon |
| Memory Usage | < 2 GB |
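
To turn the throughput figure into a rough per-request latency, divide the number of output tokens by the decode speed. The sketch below does this for the 40-60 tokens/s range in the table. The 100-token output length is an arbitrary illustration, and prompt processing (prefill) time is ignored.

```python
def decode_seconds(output_tokens, tokens_per_second):
    """Time to generate output_tokens at a steady decode rate (prefill ignored)."""
    return output_tokens / tokens_per_second


output_tokens = 100   # hypothetical translation length
slow, fast = 40, 60   # tokens/s range from the table above

worst = decode_seconds(output_tokens, slow)
best = decode_seconds(output_tokens, fast)
print(f"{output_tokens} tokens: {best:.1f}-{worst:.1f} s")
```

For a 100-token output this gives roughly 1.7 to 2.5 seconds of decode time, on top of the model load time.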
## iOS Integration
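The model's small size and memory footprint make it a good fit for iOS apps. An illustrative example, assuming a Swift wrapper class (`LLM`) around MNN's C++ LLM interface: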
```swift
import MNN
let llm = LLM(modelPath: "HY-MT1.5-1.8B-MNN")
let result = llm.generate("Translate into English: 今天天氣很好")
print(result) // "The weather is very nice today."
```
## Conversion Info
- **Tool**: MNN llmexport.py
- **MNN Version**: 3.0.0
- **Conversion Date**: 2025-12-31
- **Source Format**: HuggingFace safetensors
## Related Models
- [HY-MT1.5-7B-MNN](https://huggingface.co/jazzwang/HY-MT1.5-7B-MNN) - Larger version for higher quality
- [Hunyuan-MT-7B-MNN](https://huggingface.co/jazzwang/Hunyuan-MT-7B-MNN) - Original WMT25 version
## Why Choose 1.8B?
| Feature | 1.8B | 7B |
|---------|------|-----|
| Size | 1.07 GB | 4.47 GB |
| Speed | 40-60 tok/s | 20-30 tok/s |
| iOS Compatible | ✅ Yes | ⚠️ Mac only |
| Quality | Good | Excellent |
**Choose 1.8B for**: Mobile apps, real-time translation, resource-constrained devices
**Choose 7B for**: Desktop apps, highest translation quality, batch processing
## License
This model inherits the license from the original [HY-MT1.5-1.8B](https://huggingface.co/tencent/HY-MT1.5-1.8B) model.
## Acknowledgments
- [Tencent Hunyuan Team](https://huggingface.co/tencent) for the original model
- [Alibaba MNN Team](https://github.com/alibaba/MNN) for the inference framework