---
language: en
license: mit
library_name: pytorch
task_categories:
  - multimodal-classification
tags:
  - emotion-recognition
  - multimodal
  - early-fusion
  - text
  - audio
  - meld
---

# Early Fusion Emotion Recognition on MELD (Text + Audio)

This repository contains an **early fusion multimodal emotion recognition model** trained on the **MELD dataset**, combining **textual and acoustic embeddings** at the feature level.

Both modalities are encoded independently and fused via **embedding concatenation** before classification.

---

## Model Overview

- **Text encoder:** `bert-base-uncased` (frozen)
- **Audio encoder:** frozen; used only to pre-extract acoustic embeddings
- **Fusion strategy:** Early fusion (concatenation)
- **Classifier:** 2-layer MLP
- **Training strategy:**  
  - Both encoders are **frozen**
  - Only the fusion classifier is trained
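
The architecture above can be sketched as follows. This is a minimal illustration, not the released implementation: the class name `EarlyFusionClassifier`, the hidden size, the dropout rate, the ReLU activation, the even 768/768 split between modalities, and the choice of seven output classes are all assumptions. Only the concatenation, the 2-layer MLP, and the total fusion dimension of 1536 come from this card.

```python
import torch
import torch.nn as nn


class EarlyFusionClassifier(nn.Module):
    """Concatenate text and audio embeddings, then classify with a 2-layer MLP."""

    def __init__(self, text_dim=768, audio_dim=768, hidden_dim=256,
                 num_classes=7, dropout=0.1):
        super().__init__()
        # text_dim + audio_dim = 1536, the fusion dimension listed in this card.
        # hidden_dim, dropout, and the even 768/768 split are assumptions.
        self.mlp = nn.Sequential(
            nn.Linear(text_dim + audio_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, text_emb, audio_emb):
        # Early fusion: join the two feature vectors along the feature axis.
        fused = torch.cat([text_emb, audio_emb], dim=-1)
        return self.mlp(fused)  # unnormalized logits, shape (batch, num_classes)


model = EarlyFusionClassifier()
text_emb = torch.randn(4, 768)   # stand-in for pooled text embeddings
audio_emb = torch.randn(4, 768)  # stand-in for pre-extracted audio embeddings
logits = model(text_emb, audio_emb)
```

Because fusion happens before any classification layer, the MLP can learn interactions between text and audio features, at the cost of requiring both modalities for every input.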

---

## Dataset

- **Name:** MELD (declare-lab/MELD)
- **Modalities:** Text + Audio
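- **Setting:** Multi-class emotion classification at the utterance level (MELD annotates seven emotions: anger, disgust, fear, joy, neutral, sadness, surprise)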
- **Splits:** Train / Validation / Test (official MELD splits)

---

## Training Details

- **Loss:** Cross-entropy
- **Optimizer:** Adam
- **Fusion dimension:** 1536 (size of the concatenated text + audio feature vector)
- **Evaluation metrics:**
  - Accuracy
  - Macro F1-score
  - Per-class F1-score
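
For reference, the three metrics listed above can be computed from integer labels as in the sketch below. It is a dependency-free illustration rather than the evaluation script used for this model; the function name `classification_report` and the toy labels are made up for the example. Macro F1 averages the per-class scores with equal weight, so rare emotions count as much as frequent ones.

```python
from collections import Counter


def classification_report(y_true, y_pred, num_classes):
    """Return accuracy, per-class F1, and macro F1 for integer labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    per_class_f1 = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Convention: a class with no true or predicted samples gets F1 = 0.
        per_class_f1.append(2 * tp[c] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(y_true)
    macro_f1 = sum(per_class_f1) / num_classes
    return accuracy, per_class_f1, macro_f1


# Toy labels for three classes (not MELD data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
accuracy, per_class_f1, macro_f1 = classification_report(y_true, y_pred, num_classes=3)
```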

---

## Important Notes

- This model does **not perform end-to-end multimodal fine-tuning**.
- Both text and audio encoders act as **frozen feature extractors**.
- The provided weights correspond **only to the fusion classifier**.

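To reproduce the results, extract text and audio features with the same pipelines that were used during training. The classifier weights are only meaningful for inputs from those exact feature spaces.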
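
The frozen-encoder setup described in these notes might look like the sketch below. The two `nn.Linear` encoders are hypothetical stand-ins so the example runs without downloading any model; the input sizes, hidden size, learning rate, and number of classes are likewise assumptions. What it illustrates is the pattern itself: encoders with gradients disabled, and Adam with cross-entropy updating only the classifier.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real text and audio encoders.
text_encoder = nn.Linear(32, 768)
audio_encoder = nn.Linear(40, 768)
classifier = nn.Sequential(nn.Linear(1536, 256), nn.ReLU(), nn.Linear(256, 7))

# Freeze both encoders: no gradients, and eval mode to disable dropout etc.
for encoder in (text_encoder, audio_encoder):
    encoder.requires_grad_(False)
    encoder.eval()

# Only the classifier's parameters are handed to Adam (lr is an assumption).
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Random inputs and labels, for illustration only.
text_in, audio_in = torch.randn(8, 32), torch.randn(8, 40)
labels = torch.randint(0, 7, (8,))

encoder_before = text_encoder.weight.detach().clone()
classifier_before = classifier[0].weight.detach().clone()

with torch.no_grad():  # features are computed without building a graph
    fused = torch.cat([text_encoder(text_in), audio_encoder(audio_in)], dim=-1)

# One optimization step on the fusion classifier.
optimizer.zero_grad()
loss = criterion(classifier(fused), labels)
loss.backward()
optimizer.step()
```

Since the encoders never change, their outputs can be computed once and cached, which is what makes training on pre-extracted features possible.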

---

## Intended Use

- Multimodal emotion recognition research
- Comparison with unimodal baselines
- Early vs. late fusion analysis
- Educational and academic purposes

---

## Limitations

- Each utterance is classified in isolation; dialogue context across utterances is not modeled
- Speaker identity is not used
- Fusion is limited to simple feature concatenation
- Performance depends on the quality of the pre-extracted features