ritessshhh committed on
Commit
381203b
·
verified ·
1 Parent(s): 425e58d

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +145 -59
README.md CHANGED
@@ -1,71 +1,157 @@
- # DeBERTa Multi-Label Financial Event Classifier (SENTiVENT)
-
- This model is a fine-tuned **DeBERTa** classifier that predicts one or more **financial event types** from a news headline or short piece of text.
-
- Given a headline, the model outputs probabilities for multiple event categories such as mergers, earnings reports, legal actions, investments, and more. It is designed for **multi-label classification**, meaning a single headline can belong to multiple event types at once.
-
  ---

- ## Event Labels
-
- The model predicts the following 18 event categories:
-
- - CSR/Brand
- - Deal
- - Dividend
- - Employment
- - Expense
- - Facility
- - FinancialReport
- - Financing
- - Investment
- - Legal
- - Macroeconomics
- - Merger/Acquisition
- - Product/Service
- - Profit/Loss
- - Rating
- - Revenue
- - SalesVolume
- - SecurityValue

- ---

- ## Intended Use

- This model is useful for:

- - Financial news analysis
- - Market event extraction
- - Trading signal pipelines
- - Knowledge graph population
- - Research in finance-focused NLP

- It works best on **short financial news headlines or sentences**.

- ---

- ## Model Details

- - Base model: `deberta-v3-large`
- - Task: Multi-label text classification
- - Activation: Sigmoid (per-label probabilities)
- - Loss: Binary Cross Entropy
- - Problem type: `multi_label_classification`

- The model configuration includes `id2label`, `label2id`, and the correct problem type so it works cleanly with Hugging Face pipelines.

- ---
- license: mit
- language:
- - en
- metrics:
- - f1
- - precision
- - recall
- base_model:
- - microsoft/deberta-v3-large
- pipeline_tag: text-classification
- tags:
- - finance
- ---
+ ---
+ language:
+ - en
+ license: mit
+ tags:
+ - text-classification
+ - multi-label-classification
+ - financial-nlp
+ - finance
+ - event-detection
+ datasets:
+ - sentivent
+ metrics:
+ - f1
+ - precision
+ - recall
+ pipeline_tag: text-classification
  ---

+ # FinDeBERTa: Multi-Label Financial Event Classifier

+ FinDeBERTa is a fine-tuned DeBERTa-v3-Large model for multi-label financial event classification. It predicts one or more event types from a financial news headline, reaching a macro F1 of 0.686 (see Performance).

+ ## Model Details

+ - **Base Model**: microsoft/deberta-v3-large
+ - **Task**: Multi-label text classification
+ - **Training**: Fine-tuned with Focal Loss and per-class threshold optimization
+ - **Performance**: Macro F1 0.686 | Macro precision 0.731 | Macro recall 0.685

+ ## Event Labels

+ The model classifies text into 18 financial event types:

+ ```python
+ ["CSR/Brand", "Deal", "Dividend", "Employment", "Expense", "Facility",
+ "FinancialReport", "Financing", "Investment", "Legal", "Macroeconomics",
+ "Merger/Acquisition", "Product/Service", "Profit/Loss", "Rating", "Revenue",
+ "SalesVolume", "SecurityValue"]
+ ```

+ ## Usage
+
+ ### Basic Usage (with default threshold)

+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+ import numpy as np

+ model_name = "ritessshhh/FinDeBERTa"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)

+ text = "Tesla to acquire a battery startup in a 400 million dollar deal."
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+
+ with torch.no_grad():
+     outputs = model(**inputs)
+     probs = torch.sigmoid(outputs.logits)[0].cpu().numpy()
+
+ # Using default threshold of 0.5
+ threshold = 0.5
+ predictions = [
+     {"label": model.config.id2label[i], "score": float(prob)}
+     for i, prob in enumerate(probs) if prob >= threshold
+ ]
+
+ print(predictions)
+ ```
+
+ ### Optimized Usage (with per-class thresholds)
+
+ For best performance, use the per-class optimized thresholds:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+ import numpy as np
+ from huggingface_hub import hf_hub_download
+
+ model_name = "ritessshhh/FinDeBERTa"
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+ # Download per-class thresholds
+ thresholds_path = hf_hub_download(repo_id=model_name, filename="thresholds.npy")
+ thresholds = np.load(thresholds_path)
+
+ text = "Tesla to acquire a battery startup in a 400 million dollar deal."
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
+
+ with torch.no_grad():
+     outputs = model(**inputs)
+     probs = torch.sigmoid(outputs.logits)[0].cpu().numpy()
+
+ # Apply per-class thresholds
+ predictions = [
+     {"label": model.config.id2label[i], "score": float(prob)}
+     for i, prob in enumerate(probs) if prob >= thresholds[i]
+ ]
+
+ # Sort by score, highest first
+ predictions = sorted(predictions, key=lambda x: x["score"], reverse=True)
+ print(predictions)
+ # Output: [{"label": "Merger/Acquisition", "score": 0.98}, ...]
+ ```
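
The README does not say how `thresholds.npy` was produced. As a rough sketch of one common approach, assuming held-out validation probabilities and a grid search that maximizes each class's F1 independently, the tuning step might look like the following. The names `f1_at_threshold` and `tune_thresholds` are hypothetical and not part of the repository.

```python
def f1_at_threshold(scores, labels, threshold):
    """Binary F1 for one class when predicting positive at score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(val_probs, val_labels, grid=None):
    """Pick, for each class, the grid threshold with the best validation F1.

    val_probs and val_labels are lists of rows, one row per example and
    one column per class. Ties go to the lowest threshold in the grid.
    """
    if grid is None:
        grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    num_classes = len(val_probs[0])
    thresholds = []
    for c in range(num_classes):
        scores = [row[c] for row in val_probs]
        labels = [row[c] for row in val_labels]
        best = max(grid, key=lambda t: f1_at_threshold(scores, labels, t))
        thresholds.append(best)
    return thresholds


# Toy validation set with two classes (made-up numbers).
val_probs = [[0.9, 0.3], [0.8, 0.2], [0.4, 0.35], [0.1, 0.05]]
val_labels = [[1, 1], [1, 0], [0, 1], [0, 0]]
print(tune_thresholds(val_probs, val_labels))
```

Tuning each class separately lets a rare class use a lower cut-off than a frequent one, which a single global 0.5 threshold cannot do.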
+
+ ## Training Details
+
+ - **Loss Function**: Focal Loss (gamma=2.0) with dampened class weights
+ - **Optimizer**: AdamW with cosine learning rate scheduling
+ - **Batch Size**: 8 (with gradient accumulation steps=2)
+ - **Epochs**: 10
+ - **Learning Rate**: 2e-5
+ - **Weight Decay**: 0.02
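
The training code is not included in this commit, so the following is only a sketch of what a multi-label focal loss with gamma=2.0 could look like, written in plain Python for readability. The meaning of "dampened class weights" is not specified; the `dampened_weights` helper assumes inverse-frequency weights raised to a power below 1 and normalized to mean 1, which is one plausible reading. Both function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def focal_loss(logits, targets, class_weights=None, gamma=2.0, eps=1e-8):
    """Mean focal loss over the labels of a single example.

    Each label is treated as an independent binary problem. The factor
    (1 - p_t) ** gamma shrinks the loss of labels the model already
    predicts confidently; gamma=0 reduces to weighted binary cross entropy.
    """
    if class_weights is None:
        class_weights = [1.0] * len(logits)
    total = 0.0
    for z, y, w in zip(logits, targets, class_weights):
        p = sigmoid(z)
        p_t = p if y == 1 else 1.0 - p  # probability assigned to the true value
        total += -w * (1.0 - p_t) ** gamma * math.log(p_t + eps)
    return total / len(logits)


def dampened_weights(pos_counts, power=0.5):
    """Assumed scheme: inverse-frequency weights, softened by `power`, mean 1."""
    total = sum(pos_counts)
    raw = [(total / c) ** power for c in pos_counts]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


# A class that is 4x rarer gets 2x the weight (not 4x) with power=0.5.
weights = dampened_weights([100, 25])
print(weights)
print(focal_loss([2.0, -1.0], [1, 0], class_weights=weights))
```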
+
+ ## Performance
+
+ | Metric | Score |
+ |--------|-------|
+ | Macro F1 | 0.686 |
+ | Micro F1 | 0.710 |
+ | Precision (Macro) | 0.731 |
+ | Recall (Macro) | 0.685 |
+ | Exact Match Ratio | 0.569 |
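
For readers unfamiliar with the metrics in this table, here is a small sketch of how macro F1, micro F1, and exact match ratio are usually defined for multi-label predictions. It uses made-up toy data and a hypothetical helper `multilabel_metrics`; it is not the evaluation script behind the reported scores.

```python
def f1_from_counts(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_metrics(y_true, y_pred):
    """Return macro F1, micro F1, and exact match ratio for 0/1 label rows."""
    num_classes = len(y_true[0])
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    exact = 0
    for t_row, p_row in zip(y_true, y_pred):
        if t_row == p_row:
            exact += 1  # every label of this example is correct
        for c, (t, p) in enumerate(zip(t_row, p_row)):
            if p == 1 and t == 1:
                tp[c] += 1
            elif p == 1 and t == 0:
                fp[c] += 1
            elif p == 0 and t == 1:
                fn[c] += 1
    # Macro: average the per-class F1 scores, so every class counts equally.
    macro_f1 = sum(f1_from_counts(tp[c], fp[c], fn[c])
                   for c in range(num_classes)) / num_classes
    # Micro: pool the counts over all classes, so frequent classes dominate.
    micro_f1 = f1_from_counts(sum(tp), sum(fp), sum(fn))
    return {"macro_f1": macro_f1,
            "micro_f1": micro_f1,
            "exact_match": exact / len(y_true)}


y_true = [[1, 0], [1, 1], [0, 0]]
y_pred = [[1, 0], [1, 0], [0, 1]]
print(multilabel_metrics(y_true, y_pred))
```

Because macro F1 weights all classes equally, weak performance on rare event types pulls it below micro F1, which is consistent with the gap between the two rows above.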
+
+ ### Per-Class Performance
+
+ | Label | F1 Score |
+ |-------|----------|
+ | Merger/Acquisition | 0.896 |
+ | Dividend | 0.923 |
+ | Profit/Loss | 0.863 |
+ | Employment | 0.833 |
+ | Rating | 0.821 |
+ | SalesVolume | 0.820 |
+
+ ## Limitations
+
+ - Optimized for financial news headlines (short text)
+ - May not generalize well to other domains
+ - Performance varies by event type (rare events like "Facility" have lower F1)
+
+ ## Citation
+
+ If you use this model, please cite:
+
+ ```bibtex
+ @misc{findeberta2024,
+   author = {ritessshhh},
+   title = {FinDeBERTa: Multi-Label Financial Event Classifier},
+   year = {2024},
+   publisher = {HuggingFace},
+   howpublished = {\url{https://huggingface.co/ritessshhh/FinDeBERTa}}
+ }
+ ```