Update README.md
README.md
CHANGED
@@ -1,3 +1,134 @@
----
-license: afl-3.0
-
+---
+license: afl-3.0
+datasets:
+- LaurenGurgiolo/Micro_Facial_Expressions
+language:
+- en
+metrics:
+- accuracy
+base_model:
+- mo-thecreator/vit-Facial-Expression-Recognition
+library_name: transformers
+tags:
+- facial
+- emotion
+- recognition
+---
+"
+ViT Micro Facial Expression Recognition Model
+
+This repository contains a Vision Transformer (ViT)-based facial emotion recognition model, iteratively fine-tuned for micro- and macro-facial expression classification.
+
+The model is adapted from mo-thecreator/vit-Facial-Expression-Recognition, a Vision Transformer pre-trained and fine-tuned for facial emotion recognition, and further fine-tuned on a combined dataset of micro-expressions and FER-2013 facial expression data.
+
+Model Details
+
+Base model: mo-thecreator/vit-Facial-Expression-Recognition
+
+Architecture: Vision Transformer (ViT)
+
+Task: Facial Emotion Classification
+
+Output classes: 7 emotion categories
+
+Final model name: vit-micro-facial-expressions
+
+Emotion Classes
+
+Each facial image is classified into one of the following seven emotion categories:
+
+Label   Emotion
+0       Angry
+1       Disgust
+2       Fear
+3       Happy
+4       Sad
+5       Surprise
+6       Neutral
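
The label table above can be written as a small lookup, sketched here in plain Python. The names `ID2LABEL`, `LABEL2ID`, and `label_name` are illustrative and not part of the repository, and the ordering is simply copied from the table; the checkpoint's own `config.id2label` is the authoritative mapping and should be checked before relying on this one.

```python
# Index-to-name mapping copied from the README table (assumption: the
# checkpoint's config.id2label uses the same order).
ID2LABEL = {
    0: "Angry",
    1: "Disgust",
    2: "Fear",
    3: "Happy",
    4: "Sad",
    5: "Surprise",
    6: "Neutral",
}

# Inverse mapping, useful when encoding ground-truth labels for evaluation.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def label_name(class_index: int) -> str:
    """Return the emotion name for a predicted class index."""
    if class_index not in ID2LABEL:
        raise ValueError(f"class index must be in 0..6, got {class_index}")
    return ID2LABEL[class_index]


print(label_name(3))  # Happy
```
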
+Dataset
+
+The model was trained on a combined dataset consisting of:
+
+Micro Facial Expressions dataset
+
+FER-2013 dataset
+
+Dataset Statistics
+
+Training samples: 28,709 images
+
+Public test samples: 3,589 images
+
+The Micro Facial Expressions dataset is publicly available on Hugging Face:
+https://huggingface.co/datasets/LaurenGurgiolo/Micro_Facial_Expressions
+
+Training Methodology
+
+The base ViT model was iteratively fine-tuned on the combined dataset.
+
+Training focused on improving sensitivity to subtle micro-expressions while maintaining robustness on standard facial expressions.
+
+Iterative fine-tuning enabled progressive refinement of feature representations across emotion classes.
+
+Vision Transformers were selected because prior work reports that they outperform convolutional neural networks (CNNs) on face recognition tasks in both classification accuracy and generalization (Rodrigo et al., 2024).
+
+Performance
+
+Accuracy on micro-expression test dataset: 88%
+
+Evaluation metric: Classification Accuracy
+
+This result suggests that the model recognizes subtle facial expressions well, particularly in micro-expression scenarios.
+
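
A minimal sketch of how classification accuracy, the metric reported above, is computed, together with a per-class breakdown that can reveal weak classes an overall figure hides. The function names and the label lists are made up for illustration; they are not the author's evaluation script or results from this model.

```python
from collections import defaultdict


def accuracy(predictions, references):
    """Fraction of samples whose predicted label equals the true label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def per_class_accuracy(predictions, references):
    """Accuracy restricted to the samples of each true class (per-class recall)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for p, r in zip(predictions, references):
        totals[r] += 1
        hits[r] += int(p == r)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical predictions and ground truth for eight images.
preds = [3, 3, 0, 6, 4, 5, 3, 2]
refs = [3, 3, 0, 6, 4, 5, 6, 2]

print(accuracy(preds, refs))               # 0.875
print(per_class_accuracy(preds, refs)[6])  # 0.5
```
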
+Training Results Snapshot
+
+Figure 1: Micro-Expressions ViT Model
+(Training and evaluation metrics illustrating convergence and performance improvements across epochs.)
+
+Usage Example
+from transformers import AutoImageProcessor, AutoModelForImageClassification
+import torch
+from PIL import Image
+# Load from a local checkpoint directory; the Hub ID "LaurenGurgiolo/vit-micro-facial-expressions" also works
+processor = AutoImageProcessor.from_pretrained("./vit-micro-facial-expressions")
+model = AutoModelForImageClassification.from_pretrained("./vit-micro-facial-expressions")
+# Force 3-channel RGB, since FER-2013-style images are grayscale
+image = Image.open("face.jpg").convert("RGB")
+inputs = processor(images=image, return_tensors="pt")
+# Inference only, so skip gradient tracking
+with torch.no_grad():
+    outputs = model(**inputs)
+
+predicted_class = outputs.logits.argmax(dim=-1).item()
+print(predicted_class, model.config.id2label[predicted_class])
+
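
The usage example above reports only the winning class. A minimal sketch of turning raw logits into probabilities and ranking the top candidates follows. It is written in plain Python so it runs without the model; with PyTorch the same step would typically be `outputs.logits.softmax(dim=-1)`. The helper names and the logit values are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most probable (class_index, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up logits for one image, e.g. what outputs.logits[0].tolist() might hold.
example_logits = [0.2, -1.3, 0.1, 2.4, -0.5, 1.1, 0.9]

for class_index, prob in top_k(example_logits):
    print(f"class {class_index}: {prob:.3f}")
```

Reporting the top few probabilities rather than a single index makes it easier to spot low-confidence predictions, which matter for the subtle expressions this model targets.
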
+Limitations
+
+Performance may degrade on:
+
+Low-resolution or heavily occluded faces
+
+Extreme head poses or lighting conditions
+
+Emotion labels are inherently subjective and dataset-dependent.
+
+The model is optimized for facial images and may not generalize to non-face imagery.
+
+License & Attribution
+
+Base Model: mo-thecreator/vit-Facial-Expression-Recognition
+
+Datasets: FER-2013 and Micro Facial Expressions dataset licenses apply
+
+Please review the respective Hugging Face dataset and model licenses before commercial use.
+
+Acknowledgements
+
+Hugging Face for model hosting and datasets
+
+FER-2013 contributors
+
+Micro Facial Expressions dataset authors
+
+Prior research demonstrating ViT effectiveness in facial emotion recognition
+"