Commit c2b369a by cartesinus · 1 Parent(s): b33d256

Update README.md

Files changed (1): README.md (+82 -3)
@@ -1,9 +1,88 @@
  ---
  library_name: peft
  ---
- ## Training procedure

- ### Framework versions

- - PEFT 0.5.0
  ---
  library_name: peft
+ license: mit
+ datasets:
+ - iva_mt_wslot
+ metrics:
+ - bleu
+ model-index:
+ - name: iva_mt_wslot-m2m100_418M-en-pl-lora_adapter
+   results:
+   - task:
+       name: Sequence-to-sequence Language Modeling
+       type: text2text-generation
+     dataset:
+       name: iva_mt_wslot
+       type: iva_mt_wslot
+       config: en-pl
+       split: validation
+       args: en-pl
+     metrics:
+     - name: Bleu
+       type: bleu
+       value: 38.2365
+ language:
+ - pl
+ tags:
+ - machine translation
+ - iva
+ - virtual assistants
+ - natural language understanding
+ - nlu
  ---
 
+ # (WIP!) iva_mt_wslot-m2m100_418M-en-pl-lora_adapter

+ Notice: **Although training results are good, inference results are rather poor, for reasons that are not yet clear. I'm leaving this model here as a PoC showing that PEFT LoRA adaptation of M2M100 is possible.**

+ This model is a LoRA-adapted version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M), trained on the iva_mt_wslot dataset.
+ It achieves the following results on the test set (measured with sacrebleu):
+ - BLEU: 9.33
+
+ ## Using
+
+ To use the model, first clone the repository and navigate to the project directory:
+
+ ```bash
+ git clone https://github.com/cartesinus/multiverb_iva_mt
+ cd multiverb_iva_mt
+ ```
+
+ Then load the adapter and translate:
+
+ ```python
+ from iva_mt.iva_mt import IVAMT
+
+ # Target language; this adapter covers the en-pl pair.
+ lang = "pl"
+ translator = IVAMT(lang, peft_model_id="cartesinus/iva_mt_wslot-m2m100_418M-en-pl-lora_adapter", device="cuda:0", batch_size=128)
+ # translate() is indexed with [0] here to take the first result.
+ trans = translator.translate("here your example")[0]
+ ```
+
+ ## Training results
+
+ | Epoch | Training Loss | Validation Loss | Bleu | Gen Len |
+ |:-----:|:-------------:|:---------------:|:-------:|:-------:|
+ | 1 | 7.8621 | 7.6870 | 24.9063 | 19.3322 |
+ | 2 | 7.6340 | 7.5312 | 29.7956 | 19.7533 |
+ | 3 | 7.5582 | 7.4595 | 34.8184 | 20.1269 |
+ | 4 | 7.5047 | 7.4264 | 36.1874 | 20.5621 |
+ | 5 | 7.4888 | 7.4167 | 36.2287 | 20.4417 |
+ | 6 | 7.4560 | 7.4013 | 36.6355 | 20.2241 |
+ | 7 | 7.4477 | 7.3907 | 37.0554 | 20.0945 |
+ | 8 | 7.4422 | 7.3743 | 37.7549 | 20.1589 |
+ | 9 | 7.4311 | 7.3748 | 37.5705 | 19.9370 |
+ | 10 | 7.4294 | 7.3679 | 37.5343 | 20.2241 |
+ | 11 | 7.4114 | 7.3697 | 38.1872 | 20.3836 |
+ | 12 | 7.4224 | 7.3620 | 38.1759 | 20.1785 |
+ | 13 | 7.4334 | 7.3608 | 38.0895 | 20.2996 |
+ | 14 | 7.4133 | 7.3621 | 38.2365 | 20.2948 |
+ | 15 | 7.4158 | 7.3599 | 38.1056 | 20.2010 |
+
+
+ ## Framework versions
+
+ - PEFT 0.5.0
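
The validation BLEU recorded in the added model-index metadata (38.2365) equals the epoch 14 row of the training results table, while the README reports only 9.33 BLEU on the test set. The sketch below is not part of the commit; it picks the highest-BLEU epoch from the table values and computes the gap to the test score. The names `VALIDATION_BLEU`, `TEST_BLEU`, and `best_epoch` are hypothetical, and whether the author actually selected the checkpoint by validation BLEU is an assumption.

```python
# Validation BLEU per epoch, copied from the "Training results" table in the diff.
VALIDATION_BLEU = {
    1: 24.9063, 2: 29.7956, 3: 34.8184, 4: 36.1874, 5: 36.2287,
    6: 36.6355, 7: 37.0554, 8: 37.7549, 9: 37.5705, 10: 37.5343,
    11: 38.1872, 12: 38.1759, 13: 38.0895, 14: 38.2365, 15: 38.1056,
}

# Test-set BLEU reported in the README (measured with sacrebleu).
TEST_BLEU = 9.33


def best_epoch(bleu_by_epoch):
    """Return (epoch, bleu) for the epoch with the highest BLEU."""
    epoch = max(bleu_by_epoch, key=bleu_by_epoch.get)
    return epoch, bleu_by_epoch[epoch]


epoch, bleu = best_epoch(VALIDATION_BLEU)
gap = bleu - TEST_BLEU  # how far test BLEU falls below the best validation BLEU

print(f"best epoch: {epoch}, validation BLEU: {bleu}")
print(f"validation-to-test gap: {gap:.2f} BLEU")
```

A gap of this size is consistent with the notice in the README that inference results lag well behind the training-time metrics.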