bennysee committed on
Commit
c48e260
·
verified ·
1 Parent(s): cc9bd11

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +38 -0
README.md ADDED
@@ -0,0 +1,38 @@
+ ---
+ language: am
+ license: mit
+ tags:
+ - gpt2
+ - amharic
+ - onnx
+ - quantized
+ - q8
+ ---
+
+ # GPT-2 Amharic - Quantized ONNX (Q8)
+
+ This is rasyosef/gpt2-small-amharic exported to ONNX and quantized to 8 bits (Q8).
+
+ ## Model Size
+ - Original PyTorch: ~550 MB
+ - FP32 ONNX: 130.83 MB
+ - Q8 ONNX: 34.74 MB (about 93.7% smaller than the original PyTorch checkpoint, 73.4% smaller than the FP32 ONNX export)
+
+ ## Usage
+
+ ```python
+ import onnxruntime as ort
+ from huggingface_hub import hf_hub_download
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained('bennysee/gpt2-amharic-onnx-q8')
+ model_path = hf_hub_download('bennysee/gpt2-amharic-onnx-q8', 'model_q8.onnx')  # InferenceSession needs a local file
+ session = ort.InferenceSession(model_path)
+ inputs = tokenizer('ሰላም', return_tensors='np')  # NumPy arrays, as ONNX Runtime expects
+ outputs = session.run(None, {'input_ids': inputs['input_ids']})
+ ```
+
+ ## Files
+ - model_q8.onnx (33.24 MB) - Quantized model
+ - tokenizer.json - Tokenizer vocabulary
+ - tokenizer_config.json - Tokenizer config
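
The Usage snippet in this README stops at `session.run` and does not show what to do with `outputs`. Below is a minimal sketch of greedy next-token selection. It assumes the first output holds logits of shape `(batch, sequence, vocab)`, which is common for GPT-2 ONNX exports but is not confirmed by this README. The helper name `greedy_next_token` is hypothetical and not part of the repository.

```python
from typing import Sequence


def greedy_next_token(last_position_logits: Sequence[float]) -> int:
    """Return the index of the highest-scoring token (greedy decoding)."""
    if len(last_position_logits) == 0:
        raise ValueError("logits must not be empty")
    # Plain-Python argmax; indexing also works on a 1-D NumPy array.
    # On ties, the lowest index wins.
    return max(
        range(len(last_position_logits)),
        key=lambda i: last_position_logits[i],
    )


# Usage with the README snippet (assumes outputs[0] is the logits tensor):
#   next_id = greedy_next_token(outputs[0][0, -1])
#   print(tokenizer.decode([next_id]))
```

Greedy selection is only the simplest option; sampling strategies such as top-k would replace the argmax step while leaving the rest unchanged.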