aomar85 committed
Commit f7dbbee · verified · 1 Parent(s): 154e15e

Best fold: (F1=0.8602)

Files changed (4)
  1. README.md +71 -0
  2. model.safetensors +1 -1
  3. tokenizer.json +0 -0
  4. tokenizer_config.json +23 -0
README.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ library_name: transformers
+ base_model: aubmindlab/bert-base-arabertv02-twitter
+ tags:
+ - generated_from_trainer
+ metrics:
+ - accuracy
+ model-index:
+ - name: Twitter_concatenatewithPrompt_Augmentation-fold4
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Twitter_concatenatewithPrompt_Augmentation-fold4
+
+ This model is a fine-tuned version of [aubmindlab/bert-base-arabertv02-twitter](https://huggingface.co/aubmindlab/bert-base-arabertv02-twitter) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.4110
+ - Accuracy: 0.8605
+ - Macro F1: 0.8602
+ - Weighted F1: 0.8606
+ - F1 Pro: 0.8789
+ - F1 Against: 0.856
+ - F1 Neutral: 0.8458
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 16
+ - eval_batch_size: 16
+ - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 32
+ - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 0.1
+ - num_epochs: 8
+ - mixed_precision_training: Native AMP
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | Macro F1 | Weighted F1 | F1 Pro | F1 Against | F1 Neutral |
+ |:-------------:|:------:|:----:|:---------------:|:--------:|:--------:|:-----------:|:------:|:----------:|:----------:|
+ | 1.7120 | 2.3294 | 100 | 0.5647 | 0.7656 | 0.7664 | 0.7661 | 0.7867 | 0.7479 | 0.7644 |
+ | 0.8256 | 4.6588 | 200 | 0.4331 | 0.8427 | 0.8422 | 0.8427 | 0.8610 | 0.8392 | 0.8265 |
+ | 0.4433 | 6.9882 | 300 | 0.4109 | 0.8605 | 0.8602 | 0.8606 | 0.8789 | 0.856 | 0.8458 |
+
+
+ ### Framework versions
+
+ - Transformers 5.0.0
+ - Pytorch 2.9.0+cu128
+ - Datasets 4.0.0
+ - Tokenizers 0.22.2
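
Note on the README.md hunk above: the reported Macro F1 and total_train_batch_size can be cross-checked from the other numbers in the card. The sketch below is only a sanity check, not part of the commit. It assumes macro F1 is the unweighted mean of the three per-class F1 scores and that training ran on a single device (`num_devices` is a hypothetical name; the card does not state the device count).

```python
# Per-class F1 scores copied from the final evaluation row of the model card.
per_class_f1 = {"pro": 0.8789, "against": 0.856, "neutral": 0.8458}

# Macro F1: unweighted mean over classes.
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# Effective batch size per optimizer step.
train_batch_size = 16
gradient_accumulation_steps = 2
num_devices = 1  # assumption: not stated in the card
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

print(f"macro F1 = {macro_f1:.4f}")                           # compare with the card: 0.8602
print(f"total_train_batch_size = {total_train_batch_size}")   # compare with the card: 32
```

Both values agree with the card, which also matches the F1 quoted in the commit message.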
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4410d4f743ac07a56db8057ffcdf0cf8eb776efde990f3f6ae5299ecdcbdf76e
+ oid sha256:a35ae24084973080dad80a806c8f8038a7e52edb08d3c86f6095fcc45ffa5c6b
  size 540806124
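
Note on the model.safetensors hunk above: the file is stored through Git LFS, so the diff shows only the pointer. The `oid` is the SHA-256 of the file contents and `size` is its length in bytes; here the size is unchanged and only the hash differs, which is consistent with new weights of the same shape. A minimal sketch for checking a downloaded file against a pointer follows. The function names are hypothetical helpers, not part of any library.

```python
import hashlib
from pathlib import Path


def lfs_pointer_fields(path, chunk_size=1 << 20):
    """Return (oid, size) for a file, in the form used by Git LFS pointers."""
    digest = hashlib.sha256()
    size = 0
    with Path(path).open("rb") as f:
        # Read in chunks so large weight files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return f"sha256:{digest.hexdigest()}", size


def matches_pointer(path, expected_oid, expected_size):
    """True if the local file has the oid and size recorded in the pointer."""
    oid, size = lfs_pointer_fields(path)
    return oid == expected_oid and size == expected_size
```

For this commit, one would call `matches_pointer("model.safetensors", "sha256:a35ae24084973080dad80a806c8f8038a7e52edb08d3c86f6095fcc45ffa5c6b", 540806124)` on the downloaded file.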
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "backend": "tokenizers",
+   "cls_token": "[CLS]",
+   "do_basic_tokenize": true,
+   "do_lower_case": false,
+   "from_slow": true,
+   "is_local": false,
+   "mask_token": "[MASK]",
+   "max_len": 512,
+   "model_max_length": 512,
+   "never_split": [
+     "[بريد]",
+     "[مستخدم]",
+     "[رابط]"
+   ],
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "BertTokenizer",
+   "unk_token": "[UNK]",
+   "use_fast": true
+ }
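
Note on the tokenizer_config.json hunk above: the three `never_split` entries are Arabic placeholder tokens, `[بريد]` (email), `[مستخدم]` (user), and `[رابط]` (link), which the tokenizer keeps intact instead of breaking them into pieces. The commit does not show how input text is preprocessed, so the sketch below is an assumption: it illustrates how tweets could be masked with these placeholders before tokenization, using simplified regular expressions that are not taken from the author's pipeline or from the AraBERT preprocessor.

```python
import re

# Placeholder tokens copied from "never_split" in tokenizer_config.json.
PLACEHOLDERS = {
    "email": "[بريد]",
    "user": "[مستخدم]",
    "url": "[رابط]",
}

# Simplified, illustrative patterns (assumptions, not the original preprocessing).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
MENTION_RE = re.compile(r"@\w+")


def mask_entities(text):
    """Replace URLs, email addresses, and @mentions with placeholder tokens."""
    # URLs first, then emails, then mentions, so the "@domain" part of an
    # email address is not mistaken for a user mention.
    text = URL_RE.sub(PLACEHOLDERS["url"], text)
    text = EMAIL_RE.sub(PLACEHOLDERS["email"], text)
    text = MENTION_RE.sub(PLACEHOLDERS["user"], text)
    return text
```

For example, `mask_entities("@user see https://example.com")` replaces the mention and the URL with their placeholders and leaves the remaining words unchanged.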