Commit 72a728e (verified), committed by eacortes
1 parent: be202b2

Upload folder using huggingface_hub

Files changed (9)
  1. .gitattributes +1 -0
  2. README.md +253 -0
  3. config.json +70 -0
  4. model.rknn +3 -0
  5. rknn.json +46 -0
  6. special_tokens_map.json +7 -0
  7. tokenizer.json +0 -0
  8. tokenizer_config.json +56 -0
  9. vocab.txt +0 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ model.rknn filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,253 @@
+ ---
+ language: en
+ license: apache-2.0
+ metrics:
+ - squad
+ tags:
+ - rknn
+ - rockchip
+ - npu
+ - rk-transformers
+ - rk3588
+ library_name: rk-transformers
+ base_model: distilbert/distilbert-base-cased-distilled-squad
+ model_name: distilbert-base-cased-distilled-squad
+ ---
+ # distilbert-base-cased-distilled-squad (RKNN2)
+
+ > This is an RKNN-compatible version of the [distilbert/distilbert-base-cased-distilled-squad](https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad) model. It has been optimized for Rockchip NPUs using the [rk-transformers](https://github.com/emapco/rk-transformers) library.
+
+ <details><summary>Click to see the RKNN model details and usage examples</summary>
+
+ ## Model Details
+
+ - **Original Model:** [distilbert/distilbert-base-cased-distilled-squad](https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad)
+ - **Target Platform:** rk3588
+ - **rknn-toolkit2 Version:** 2.3.2
+ - **rk-transformers Version:** 0.3.0
+
+ ### Available Model Files
+
+ | Model File | Optimization Level | Quantization | File Size |
+ | :--------- | :----------------- | :----------- | :-------- |
+ | [model.rknn](./model.rknn) | 0 | float16 | 128.1 MB |
+
+ ## Usage
+
+ ### Installation
+
+ Install `rk-transformers` with inference dependencies to use this model:
+
+ ```bash
+ pip install rk-transformers[inference]
+ ```
+
+ #### RK-Transformers API
+
+ ```python
+ from rktransformers import RKModelForQuestionAnswering
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("rk-transformers/distilbert-base-cased-distilled-squad")
+ model = RKModelForQuestionAnswering.from_pretrained(
+     "rk-transformers/distilbert-base-cased-distilled-squad",
+     platform="rk3588",
+     core_mask="auto",
+ )
+
+ question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
+ inputs = tokenizer(question, text, return_tensors="np")
+ outputs = model(**inputs)
+ start_logits = outputs.start_logits
+ end_logits = outputs.end_logits
+ print(start_logits.shape)
+ print(end_logits.shape)
+ ```
+
+ ## Configuration
+
+ The full configuration for all exported RKNN models is available in the [config.json](./config.json) file.
+
+ </details>
+
+ ---
+
+ # DistilBERT base cased distilled SQuAD
+
+ ## Table of Contents
+ - [Model Details](#model-details)
+ - [How to Get Started with the Model](#how-to-get-started-with-the-model)
+ - [Uses](#uses)
+ - [Risks, Limitations and Biases](#risks-limitations-and-biases)
+ - [Training](#training)
+ - [Evaluation](#evaluation)
+ - [Environmental Impact](#environmental-impact)
+ - [Technical Specifications](#technical-specifications)
+ - [Citation Information](#citation-information)
+ - [Model Card Authors](#model-card-authors)
+
+ ## Model Details
+
+ **Model Description:** The DistilBERT model was proposed in the blog post [Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT](https://medium.com/huggingface/distilbert-8cf3380435b5), and the paper [DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter](https://arxiv.org/abs/1910.01108). DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than *bert-base-uncased* and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.
+
+ This model is a fine-tuned checkpoint of [DistilBERT-base-cased](https://huggingface.co/distilbert-base-cased), fine-tuned using (a second step of) knowledge distillation on [SQuAD v1.1](https://huggingface.co/datasets/squad).
+
+ - **Developed by:** Hugging Face
+ - **Model Type:** Transformer-based language model
+ - **Language(s):** English
+ - **License:** Apache 2.0
+ - **Related Models:** [DistilBERT-base-cased](https://huggingface.co/distilbert-base-cased)
+ - **Resources for more information:**
+   - See [this repository](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) for more about Distil\* (a class of compressed models including this model)
+   - See [Sanh et al. (2019)](https://arxiv.org/abs/1910.01108) for more information about knowledge distillation and the training procedure
+
+ ## How to Get Started with the Model
+
+ Use the code below to get started with the model.
+
+ ```python
+ >>> from transformers import pipeline
+ >>> question_answerer = pipeline("question-answering", model='distilbert-base-cased-distilled-squad')
+
+ >>> context = r"""
+ ... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
+ ... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
+ ... a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
+ ... """
+
+ >>> result = question_answerer(question="What is a good example of a question answering dataset?", context=context)
+ >>> print(
+ ...     f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
+ ... )
+
+ Answer: 'SQuAD dataset', score: 0.5152, start: 147, end: 160
+ ```
+
+ Here is how to use this model in PyTorch:
+
+ ```python
+ from transformers import DistilBertTokenizer, DistilBertForQuestionAnswering
+ import torch
+
+ tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-cased-distilled-squad')
+ # Load the question-answering head rather than the bare DistilBertModel encoder
+ model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-cased-distilled-squad')
+
+ question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
+
+ inputs = tokenizer(question, text, return_tensors="pt")
+ with torch.no_grad():
+     outputs = model(**inputs)
+
+ # Most likely start and end token positions of the answer span
+ answer_start_index = torch.argmax(outputs.start_logits)
+ answer_end_index = torch.argmax(outputs.end_logits)
+
+ predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
+ print(tokenizer.decode(predict_answer_tokens))
+ ```
+
+ And in TensorFlow:
+
+ ```python
+ from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering
+ import tensorflow as tf
+
+ tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")
+ model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-squad")
+
+ question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
+
+ inputs = tokenizer(question, text, return_tensors="tf")
+ outputs = model(**inputs)
+
+ answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
+ answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])
+
+ predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
+ tokenizer.decode(predict_answer_tokens)
+ ```
+
+ ## Uses
+
+ This model can be used for question answering.
+
+ #### Misuse and Out-of-scope Use
+
+ The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to be factual or true representations of people or events, and therefore using the model to generate such content is out-of-scope for the abilities of this model.
+
+ ## Risks, Limitations and Biases
+
+ **CONTENT WARNING: Readers should be aware that language generated by this model can be disturbing or offensive to some and can propagate historical and current stereotypes.**
+
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. For example:
+
+
+ ```python
+ >>> from transformers import pipeline
+ >>> question_answerer = pipeline("question-answering", model='distilbert-base-cased-distilled-squad')
+
+ >>> context = r"""
+ ... Alice is sitting on the bench. Bob is sitting next to her.
+ ... """
+
+ >>> result = question_answerer(question="Who is the CEO?", context=context)
+ >>> print(
+ ...     f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
+ ... )
+
+ Answer: 'Bob', score: 0.7527, start: 32, end: 35
+ ```
+
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+
+ ## Training
+
+ #### Training Data
+
+ The [distilbert-base-cased model](https://huggingface.co/distilbert-base-cased) was trained using the same data as the [distilbert-base-uncased model](https://huggingface.co/distilbert-base-uncased). The [distilbert-base-uncased model card](https://huggingface.co/distilbert-base-uncased) describes its training data as:
+
+ > DistilBERT pretrained on the same data as BERT, which is [BookCorpus](https://yknzhu.wixsite.com/mbweb), a dataset consisting of 11,038 unpublished books and [English Wikipedia](https://en.wikipedia.org/wiki/English_Wikipedia) (excluding lists, tables and headers).
+
+ To learn more about the SQuAD v1.1 dataset, see the [SQuAD v1.1 data card](https://huggingface.co/datasets/squad).
+
+ #### Training Procedure
+
+ ##### Preprocessing
+
+ See the [distilbert-base-cased model card](https://huggingface.co/distilbert-base-cased) for further details.
+
+ ##### Pretraining
+
+ See the [distilbert-base-cased model card](https://huggingface.co/distilbert-base-cased) for further details.
+
+ ## Evaluation
+
+ As discussed in the [model repository](https://github.com/huggingface/transformers/blob/main/examples/research_projects/distillation/README.md):
+
+ > This model reaches a F1 score of 87.1 on the SQuAD v1.1 dev set (for comparison, BERT bert-base-cased version reaches a F1 score of 88.7).
+
+ ## Environmental Impact
+
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). We present the hardware type and hours used based on the [associated paper](https://arxiv.org/pdf/1910.01108.pdf). Note that these details are just for training DistilBERT, not including the fine-tuning with SQuAD.
+
+ - **Hardware Type:** 8 16GB V100 GPUs
+ - **Hours used:** 90 hours
+ - **Cloud Provider:** Unknown
+ - **Compute Region:** Unknown
+ - **Carbon Emitted:** Unknown
+
+ ## Technical Specifications
+
+ See the [associated paper](https://arxiv.org/abs/1910.01108) for details on the modeling architecture, objective, compute infrastructure, and training details.
+
+ ## Citation Information
+
+ ```bibtex
+ @inproceedings{sanh2019distilbert,
+   title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
+   author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
+   booktitle={NeurIPS EMC^2 Workshop},
+   year={2019}
+ }
+ ```
+
+ APA:
+ - Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
+
+ ## Model Card Authors
+
+ This model card was written by the Hugging Face team.
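
The RK-Transformers example in the README stops at printing the shapes of `start_logits` and `end_logits`. Taking the two argmax positions independently can yield an end before the start, so a joint search over valid spans is a common alternative. The following is a minimal sketch of that search, assuming the logits for one sequence have been converted to plain lists (for example with `outputs.start_logits[0].tolist()`). The helper `best_span`, the `max_answer_len` cap, and the toy logits are illustrative and not part of `rk-transformers`.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) maximising start_logit + end_logit with start <= end."""
    best = (0, 0, float("-inf"))
    for i, start_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap
        last = min(i + max_answer_len, len(end_logits))
        for j in range(i, last):
            score = start_score + end_logits[j]
            if score > best[2]:
                best = (i, j, score)
    return best


# Toy logits (hypothetical values, not model output). Independent argmax would
# pick start=1 and end=0, which is not a valid span.
toy_start = [0.0, 5.0, 0.0]
toy_end = [6.0, 0.0, 2.0]
print(best_span(toy_start, toy_end))

start_logits = [0.1, 0.2, 0.0, 0.3, 4.0, 0.5, 0.2, 0.1]
end_logits = [0.0, 0.1, 0.2, 0.1, 0.3, 0.4, 5.0, 0.2]
start, end, score = best_span(start_logits, end_logits)
print(start, end, score)
```

With real model output, the token ids between `start` and `end` (inclusive) could then be passed to `tokenizer.decode`, as the TensorFlow example in the README does. A fuller implementation would also skip positions that belong to the question or to padding.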
config.json ADDED
@@ -0,0 +1,70 @@
+ {
+   "activation": "gelu",
+   "architectures": [
+     "DistilBertForQuestionAnswering"
+   ],
+   "attention_dropout": 0.1,
+   "dim": 768,
+   "dropout": 0.1,
+   "hidden_dim": 3072,
+   "initializer_range": 0.02,
+   "max_position_embeddings": 512,
+   "model_type": "distilbert",
+   "n_heads": 12,
+   "n_layers": 6,
+   "output_past": true,
+   "pad_token_id": 0,
+   "qa_dropout": 0.1,
+   "rknn": {
+     "model.rknn": {
+       "batch_size": 1,
+       "custom_string": null,
+       "dynamic_input": null,
+       "float_dtype": "float16",
+       "inputs_yuv_fmt": null,
+       "max_seq_length": 512,
+       "mean_values": null,
+       "model_input_names": [
+         "input_ids",
+         "attention_mask"
+       ],
+       "opset": 19,
+       "optimization": {
+         "compress_weight": false,
+         "enable_flash_attention": true,
+         "model_pruning": false,
+         "optimization_level": 0,
+         "remove_reshape": false,
+         "remove_weight": false,
+         "sparse_infer": false
+       },
+       "quantization": {
+         "auto_hybrid_cos_thresh": 0.98,
+         "auto_hybrid_euc_thresh": null,
+         "dataset_columns": null,
+         "dataset_name": null,
+         "dataset_size": 128,
+         "dataset_split": null,
+         "dataset_subset": null,
+         "do_quantization": false,
+         "quant_img_RGB2BGR": false,
+         "quantized_algorithm": "normal",
+         "quantized_dtype": "w8a8",
+         "quantized_hybrid_level": 0,
+         "quantized_method": "channel"
+       },
+       "rktransformers_version": "0.3.0",
+       "single_core_mode": false,
+       "std_values": null,
+       "target_platform": "rk3588",
+       "task": "question-answering",
+       "task_kwargs": null
+     }
+   },
+   "seq_classif_dropout": 0.2,
+   "sinusoidal_pos_embds": true,
+   "tie_weights_": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.55.4",
+   "vocab_size": 28996
+ }
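
The `rknn` block records a fixed input shape: `batch_size` is 1, `max_seq_length` is 512, and `dynamic_input` is `null`. If that means the compiled graph expects exactly 512 positions per input (an inference from these fields, not something the diff states), shorter inputs would need padding with `pad_token_id`. The sketch below illustrates this in plain Python. `pad_to_static_shape` is a hypothetical helper, the two middle token ids are placeholders, and the library or tokenizer may already handle padding.

```python
import json

# Subset of the config.json fields shown in the diff above
CONFIG = json.loads("""
{
  "pad_token_id": 0,
  "rknn": {
    "model.rknn": {
      "batch_size": 1,
      "max_seq_length": 512,
      "model_input_names": ["input_ids", "attention_mask"]
    }
  }
}
""")


def pad_to_static_shape(input_ids, config=CONFIG, model_file="model.rknn"):
    """Pad one token-id sequence to the fixed length recorded for `model_file`."""
    export = config["rknn"][model_file]
    max_len = export["max_seq_length"]
    if len(input_ids) > max_len:
        raise ValueError(f"sequence of {len(input_ids)} tokens exceeds {max_len}")
    n_pad = max_len - len(input_ids)
    features = {
        "input_ids": list(input_ids) + [config["pad_token_id"]] * n_pad,
        # 1 marks real tokens, 0 marks padding
        "attention_mask": [1] * len(input_ids) + [0] * n_pad,
    }
    # Return only the inputs the exported model declares, in that order
    return {name: features[name] for name in export["model_input_names"]}


# 101 and 102 are [CLS] and [SEP]; 2627 and 1108 are placeholder ids
padded = pad_to_static_shape([101, 2627, 1108, 102])
print(len(padded["input_ids"]), sum(padded["attention_mask"]))
```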
model.rknn ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dad1a2ee01bf140c4e594c07595f17b0fece86ebca6f22c58d53fb71fab0c28a
+ size 134359812
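
The three added lines are a Git LFS pointer, not the model weights themselves. As a rough sketch, the pointer can be parsed as `key value` lines, and the recorded byte count can be compared with the "128.1 MB" listed in the README table. The figure matches when the size is read in mebibytes (2^20 bytes). `parse_lfs_pointer` is a hypothetical helper written for this illustration.

```python
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:dad1a2ee01bf140c4e594c07595f17b0fece86ebca6f22c58d53fb71fab0c28a
size 134359812
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)
size_mib = round(info["size"] / 2**20, 1)   # mebibytes
size_mb = round(info["size"] / 10**6, 1)    # decimal megabytes
print(info["algo"], len(info["digest"]), size_mib, size_mb)
```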
rknn.json ADDED
@@ -0,0 +1,46 @@
+ {
+   "model.rknn": {
+     "rktransformers_version": "0.2.0",
+     "model_input_names": [
+       "input_ids",
+       "attention_mask"
+     ],
+     "batch_size": 1,
+     "max_seq_length": 512,
+     "task_kwargs": null,
+     "float_dtype": "float16",
+     "target_platform": "rk3588",
+     "single_core_mode": false,
+     "mean_values": null,
+     "std_values": null,
+     "custom_string": null,
+     "inputs_yuv_fmt": null,
+     "dynamic_input": null,
+     "opset": 19,
+     "task": "question-answering",
+     "quantization": {
+       "do_quantization": false,
+       "dataset_name": null,
+       "dataset_subset": null,
+       "dataset_size": 128,
+       "dataset_split": null,
+       "dataset_columns": null,
+       "quantized_dtype": "w8a8",
+       "quantized_algorithm": "normal",
+       "quantized_method": "channel",
+       "quantized_hybrid_level": 0,
+       "quant_img_RGB2BGR": false,
+       "auto_hybrid_cos_thresh": 0.98,
+       "auto_hybrid_euc_thresh": null
+     },
+     "optimization": {
+       "optimization_level": 0,
+       "enable_flash_attention": true,
+       "remove_weight": false,
+       "compress_weight": false,
+       "remove_reshape": false,
+       "sparse_infer": false,
+       "model_pruning": false
+     }
+   }
+ }
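
`rknn.json` repeats the export settings that also appear under the `rknn` key of `config.json`, yet the two copies in this commit disagree on `rktransformers_version` (`0.3.0` in `config.json`, `0.2.0` here). A consistency check along the following lines could flag such drift. This is a sketch only: `diff_settings` is a hypothetical helper, and only a few of the fields are reproduced.

```python
def diff_settings(a, b, prefix=""):
    """Recursively list (path, a_value, b_value) for keys whose values differ."""
    diffs = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}{key}"
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            diffs.extend(diff_settings(va, vb, prefix=path + "."))
        elif va != vb:
            diffs.append((path, va, vb))
    return diffs


# A few fields copied from the two files in this commit
from_config_json = {
    "rktransformers_version": "0.3.0",
    "max_seq_length": 512,
    "float_dtype": "float16",
    "optimization": {"optimization_level": 0, "enable_flash_attention": True},
}
from_rknn_json = {
    "rktransformers_version": "0.2.0",
    "max_seq_length": 512,
    "float_dtype": "float16",
    "optimization": {"optimization_level": 0, "enable_flash_attention": True},
}

for path, left, right in diff_settings(from_config_json, from_rknn_json):
    print(f"{path}: config.json={left!r} rknn.json={right!r}")
```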
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "cls_token": "[CLS]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": "[UNK]"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,56 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "100": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "101": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "102": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "103": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "extra_special_tokens": {},
+   "mask_token": "[MASK]",
+   "model_max_length": 512,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "DistilBertTokenizer",
+   "unk_token": "[UNK]"
+ }
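
The `added_tokens_decoder` map pins the ids of the special tokens (`[PAD]` is 0, `[CLS]` is 101, `[SEP]` is 102). BERT-style tokenizers lay out a question/context pair as `[CLS] question [SEP] context [SEP]`. The sketch below reproduces that layout by hand to show where the context tokens begin. `build_pair` is a hypothetical helper and the word-piece ids are placeholders; in practice the tokenizer does this itself.

```python
# Special-token ids taken from added_tokens_decoder in tokenizer_config.json
SPECIAL = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102, "[MASK]": 103}
MODEL_MAX_LENGTH = 512  # "model_max_length" in tokenizer_config.json


def build_pair(question_ids, context_ids):
    """Lay out ids as [CLS] question [SEP] context [SEP]; return ids and context offset."""
    ids = [
        SPECIAL["[CLS]"],
        *question_ids,
        SPECIAL["[SEP]"],
        *context_ids,
        SPECIAL["[SEP]"],
    ]
    if len(ids) > MODEL_MAX_LENGTH:
        raise ValueError("pair is longer than model_max_length")
    # [CLS] + question + [SEP] precede the first context token
    context_start = len(question_ids) + 2
    return ids, context_start


# Placeholder word-piece ids for a 3-token question and a 2-token context
ids, context_start = build_pair([1001, 1002, 1003], [2001, 2002])
print(ids, context_start)
```

Knowing `context_start` is useful when decoding answers, since a predicted span should normally fall inside the context rather than the question.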
vocab.txt ADDED
The diff for this file is too large to render. See raw diff