---
license: mit
tags:
  - face-recognition
  - arcface
  - metric-learning
  - resnet
  - pytorch
  - biometrics
  - attendance-system
  - computer-vision
  - embeddings
library_name: pytorch
pipeline_tag: image-feature-extraction
language:
  - en
---

# 🧠 ArcFace Industrial Face Recognition

> **ResNet-50 backbone trained with ArcFace loss on VGGFace2 — production-grade face embeddings for real-world identity verification.**

This model is part of a broader comparative study of deep face recognition loss functions (ArcFace, SphereFace, Triplet Loss). After systematic evaluation across three experimental rounds, **ArcFace was selected as the production model** based on its training stability, numerical reliability at scale, and consistent generalization gains as data grows.

---

## 📊 Model Performance

| Metric | Value |
|---|---|
| Training Accuracy | **99%** |
| Validation Accuracy | **85%** |
| Training Loss | **0.03** |
| Validation Loss | **4.00** |
| Training Identities | 3,000 |
| Images per Identity | ~200 |
| Epochs | 100 |

> Validation loss reflects expected open-set generalization behavior — the model is trained on a closed identity set and evaluated against unseen faces. This gap narrows with more training data.

---

## 🏗️ Architecture

```
Input Image (112×112×3)
        │
        ▼
  ResNet-50 Backbone
        │
        ▼
  512-D L2-Normalized Embedding
        │
        ▼
   ArcFace Head
   (Additive Angular Margin, m=0.5, s=64)
```

| Component | Details |
|---|---|
| Backbone | ResNet-50 |
| Embedding Dimension | 512 |
| Loss Function | ArcFace (m=0.5, s=64) |
| Input Resolution | 112 × 112 |
| Embedding Normalization | L2 |
| Optimizer | SGD with cosine decay |
| Base Learning Rate | 0.1 |

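For readers unfamiliar with the loss, the head in the diagram above can be sketched roughly as follows. This is a minimal illustration and not the code used to train this model: the class name `ArcFaceHead` and its constructor arguments are hypothetical, and only m = 0.5, s = 64, and the 512-D embedding size come from the table. For brevity it omits the usual safeguard for target angles where θ + m exceeds π.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ArcFaceHead(nn.Module):
    """Illustrative additive angular margin head (hypothetical, not the repo's code)."""

    def __init__(self, embedding_dim: int = 512, num_classes: int = 3000,
                 margin: float = 0.5, scale: float = 64.0):
        super().__init__()
        self.margin = margin
        self.scale = scale
        # One learnable class-centre vector per training identity.
        self.weight = nn.Parameter(torch.empty(num_classes, embedding_dim))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine of the angle between each embedding and each class centre.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1.0 + 1e-7, 1.0 - 1e-7))
        # Add the margin m to the target-class angle only.
        is_target = F.one_hot(labels, num_classes=cosine.size(1)).bool()
        theta = torch.where(is_target, theta + self.margin, theta)
        # Scaled logits, ready for cross-entropy.
        return self.scale * torch.cos(theta)


# Toy usage with random embeddings and 10 classes.
torch.manual_seed(0)
head = ArcFaceHead(num_classes=10)
emb = torch.randn(4, 512)
labels = torch.tensor([0, 3, 5, 9])
logits = head(emb, labels)
loss = F.cross_entropy(logits, labels)
```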
---

## 🚀 Usage

### Load the Model

```python
import torch
from src.config import load_config
from src.models.face_model import build_face_model

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cfg = load_config("configs/base.yaml")

model = build_face_model(cfg, num_classes=None)  # inference mode
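ckpt = torch.load("model.pt", map_location=device)
# The checkpoint is either a dict wrapping the weights under "model_state"
# or a bare state dict.
state_dict = ckpt.get("model_state", ckpt)

# Keep only the backbone weights and strip the "backbone." prefix.
backbone_state = {
    k.replace("backbone.", "", 1): v
    for k, v in state_dict.items()
    if k.startswith("backbone.")
}
if not backbone_state:
    raise KeyError("no 'backbone.' keys found in checkpoint")

# strict=False tolerates key mismatches, so report them to catch a partial load.
result = model.backbone.load_state_dict(backbone_state, strict=False)
print("missing:", result.missing_keys, "unexpected:", result.unexpected_keys)
model.eval().to(device)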
```

### Extract Embeddings

```python
import cv2
import numpy as np
from src.data.preprocessing import PreprocessingPipeline

pipeline = PreprocessingPipeline(
    preproc_cfg=cfg.preprocessing,
    image_size=cfg.data.image_size,
    apply_detection=False,
)

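@torch.no_grad()
def get_embedding(img_bgr: np.ndarray) -> np.ndarray:
    """Return the embedding for one face image loaded with OpenCV (BGR)."""
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)
    chw = pipeline(img_rgb)  # channels-first array
    # Cast to float32 and add a batch dimension before the forward pass.
    tensor = torch.from_numpy(chw).float().unsqueeze(0).to(device)
    emb = model(tensor)
    return emb.squeeze(0).cpu().numpy().astype(np.float32)

img = cv2.imread("face.jpg")
if img is None:  # cv2.imread returns None on failure instead of raising
    raise FileNotFoundError("face.jpg could not be read")
embedding = get_embedding(img)  # shape: (512,), L2-normalized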
```

### Compute Similarity

```python
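import numpy as np
from numpy.linalg import norm

def cosine_similarity(emb1: np.ndarray, emb2: np.ndarray) -> float:
    return float(np.dot(emb1, emb2) / (norm(emb1) * norm(emb2)))

# Compare two face crops using get_embedding() from the previous step.
# The file names are placeholders.
embedding_a = get_embedding(cv2.imread("face_a.jpg"))
embedding_b = get_embedding(cv2.imread("face_b.jpg"))

sim = cosine_similarity(embedding_a, embedding_b)
# sim ∈ [-1, 1]; higher = more similar
# Typical threshold for same-person: ≥ 0.5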
```

---

## 🗄️ Production Pipeline

This model powers a complete two-stage attendance system:

### Stage 1 — Database Population (`build_database.py`)

Registers known identities by computing gallery embeddings and storing them in **ChromaDB** (vector similarity search) linked to identity metadata in **MongoDB**. Runs a built-in Top-K evaluation on held-out probe images after registration.

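The registration and lookup flow can be illustrated with a small in-memory stand-in. This sketch deliberately swaps ChromaDB and MongoDB for a plain NumPy array so that it runs anywhere. The class `InMemoryGallery`, its methods, and the choice to store one averaged embedding per identity are assumptions made for illustration, not a description of `build_database.py`.

```python
import numpy as np


class InMemoryGallery:
    """Toy stand-in for a vector store: keeps embeddings in a NumPy array."""

    def __init__(self, dim: int = 512):
        self.dim = dim
        self.names: list[str] = []
        self.vectors = np.empty((0, dim), dtype=np.float32)

    def register(self, name: str, embeddings: np.ndarray) -> None:
        """Store the L2-normalized mean of one identity's gallery embeddings."""
        stacked = np.asarray(embeddings, dtype=np.float32).reshape(-1, self.dim)
        mean = stacked.mean(axis=0)
        mean /= np.linalg.norm(mean)
        self.names.append(name)
        self.vectors = np.vstack([self.vectors, mean])

    def query(self, probe: np.ndarray, k: int = 5) -> list[tuple[str, float]]:
        """Return the k most similar identities as (name, cosine similarity)."""
        probe = probe / np.linalg.norm(probe)
        sims = self.vectors @ probe
        top = np.argsort(-sims)[:k]
        return [(self.names[i], float(sims[i])) for i in top]


# Synthetic 512-D vectors stand in for real face embeddings.
rng = np.random.default_rng(0)
alice = rng.normal(size=512)
bob = rng.normal(size=512)

gallery = InMemoryGallery()
gallery.register("alice", alice + 0.1 * rng.normal(size=(5, 512)))
gallery.register("bob", bob + 0.1 * rng.normal(size=(5, 512)))

matches = gallery.query(alice + 0.1 * rng.normal(size=512), k=2)
```

In a real deployment the `query` step would be delegated to the vector database, which avoids scanning every stored vector for each probe.
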
### Stage 2 — Real-Time Inference (`realtime_attendance.py`)

Reads a live webcam feed, detects faces with **MTCNN**, embeds each crop through this model, and queries ChromaDB for the nearest registered identity. Recognized faces are labeled with name and similarity score; unknown faces are flagged. A cooldown timer prevents duplicate attendance logs.

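The unknown-face check and the cooldown logic described above might look something like the sketch below. The class `AttendanceLogger`, the 60-second cooldown, and the injectable clock are hypothetical choices for illustration; only the 0.5 similarity threshold is taken from this card. It does not reproduce `realtime_attendance.py`.

```python
import time
from typing import Callable


class AttendanceLogger:
    """Logs a recognized identity at most once per cooldown window."""

    def __init__(self, threshold: float = 0.5, cooldown_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._last_logged: dict[str, float] = {}
        self.records: list[tuple[str, float]] = []

    def observe(self, name: str, similarity: float) -> str:
        """Return the label to display; append a record if attendance is logged."""
        if similarity < self.threshold:
            return "unknown"
        now = self.clock()
        last = self._last_logged.get(name)
        # Log only on first sight or once the cooldown has elapsed.
        if last is None or now - last >= self.cooldown_s:
            self._last_logged[name] = now
            self.records.append((name, now))
        return name


# A fake clock makes the behaviour easy to follow without waiting.
fake_time = [0.0]
logger = AttendanceLogger(cooldown_s=60.0, clock=lambda: fake_time[0])

logger.observe("alice", 0.81)             # logged
fake_time[0] = 10.0
logger.observe("alice", 0.79)             # inside cooldown, not logged again
fake_time[0] = 75.0
logger.observe("alice", 0.80)             # cooldown elapsed, logged again
label = logger.observe("stranger", 0.21)  # below threshold
```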
---

## 📈 Gallery Evaluation Results

Evaluated on 50 registered identities (250 held-out probe images, never seen during training or registration):

| Metric | Result |
|---|---|
| Top-1 Accuracy | **92.00%** (230 / 250) |
| Top-3 Accuracy | **96.00%** (240 / 250) |
| Top-5 Accuracy | **96.80%** (242 / 250) |
| Failed reads | 0 / 250 |

> Probe images are drawn from the same VGGFace2 distribution as the gallery but are a completely separate split — never used during model training or gallery registration.

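As a reference for how Top-K figures like these are typically computed, here is a minimal sketch. The function `top_k_accuracy` and the toy similarity matrix are illustrative assumptions; the evaluation built into `build_database.py` may differ in its details.

```python
import numpy as np


def top_k_accuracy(sims: np.ndarray, true_idx: np.ndarray,
                   ks: tuple[int, ...] = (1, 3, 5)) -> dict[int, float]:
    """sims[i, j] is the cosine similarity between probe i and gallery identity j."""
    # Gallery indices sorted from most to least similar, per probe.
    ranked = np.argsort(-sims, axis=1)
    hits = ranked == true_idx[:, None]
    # A probe counts as correct at k if its true identity is in the first k ranks.
    return {k: float(hits[:, :k].any(axis=1).mean()) for k in ks}


# Toy example: 3 probes against a 4-identity gallery.
sims = np.array([
    [0.9, 0.2, 0.1, 0.0],  # probe 0: true identity 0 is ranked first
    [0.3, 0.4, 0.8, 0.1],  # probe 1: true identity 1 is ranked second
    [0.5, 0.6, 0.1, 0.7],  # probe 2: true identity 2 is ranked fourth
])
true_idx = np.array([0, 1, 2])
acc = top_k_accuracy(sims, true_idx, ks=(1, 2, 4))
```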
---

## 🔬 Why ArcFace for Production

ArcFace was selected over SphereFace (the Round 3 LFW leader) based on engineering considerations critical for deployment:

- **Additive angular margin** has a direct geometric interpretation on the hypersphere — the decision boundary is fixed and predictable, making threshold calibration reliable across unseen identities.
- **Numerical stability at scale**: SphereFace's multiplicative margin becomes more sensitive as the class count increases, whereas ArcFace's additive margin remained stable in these experiments.
- **Consistent data scaling**: validation accuracy improved from 83% (1,000 identities) to 85% (3,000 identities), which suggests that generalization continues to improve as the training set grows.
- **Industry standard**: ArcFace is widely used in production face recognition systems, with extensive tooling for quantization, ONNX export, and edge deployment.

---

## 📦 Training Data

| Dataset | Identities | Images | Resolution |
|---|---|---|---|
| VGGFace2 (subset) | 3,000 | ~600,000 | 112 × 112 |

Full dataset: [VGGFace2 112×112 on Kaggle](https://www.kaggle.com/datasets/yakhyokhuja/vggface2-112x112)

---

## 🔗 Related Resources

| Resource | Link |
|---|---|
| 📓 Training & Evaluation Notebook | [Kaggle: ArcFace Training](https://www.kaggle.com/code/abdelrhmansaadidrees/2-facerecognition-arcface) |
| 📄 ArcFace Paper | [arXiv:1801.07698](https://arxiv.org/abs/1801.07698) |
| 🤗 Triplet Loss Model | [AbdoSaad24/TripletLossModels](https://huggingface.co/AbdoSaad24/TripletLossModels) |
| 🤗 SphereFace Model | [AbdoSaad24/BestSphereFaceModel](https://huggingface.co/AbdoSaad24/BestSphereFaceModel) |
| 🤗 ArcFace (Research, R3) | [AbdoSaad24/BestArcFaceModel](https://huggingface.co/AbdoSaad24/BestArcFaceModel) |

---

## 📋 Citation

```bibtex
@inproceedings{deng2019arcface,
  title     = {ArcFace: Additive Angular Margin Loss for Deep Face Recognition},
  author    = {Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle = {CVPR},
  year      = {2019}
}
```

---

## ⚠️ Limitations & Responsible Use

- This model was trained on a subset of VGGFace2. Performance may degrade on faces from demographics underrepresented in the training data.
- The model is intended for **attendance and access control systems** where subjects have consented to enrollment.
- Do not use for surveillance, tracking, or identification of individuals without explicit consent.
- Threshold selection (default: 0.5 cosine similarity) should be calibrated to your deployment environment — lower thresholds increase false acceptances, higher thresholds increase false rejections.

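One way to approach the threshold calibration mentioned above is to measure false acceptance and false rejection rates on labeled pairs from the target environment. The sketch below is only an illustration: the function names, the 1% FAR target, and the synthetic score distributions are assumptions, and real genuine and impostor scores should come from your own enrollment data.

```python
import numpy as np


def far_frr(genuine: np.ndarray, impostor: np.ndarray,
            threshold: float) -> tuple[float, float]:
    """False acceptance and false rejection rates at a cosine-similarity threshold."""
    far = float(np.mean(impostor >= threshold))  # different people accepted
    frr = float(np.mean(genuine < threshold))    # same person rejected
    return far, frr


def pick_threshold(genuine: np.ndarray, impostor: np.ndarray,
                   max_far: float = 0.01) -> tuple[float, float, float]:
    """Lowest threshold on a 0.01-step grid whose FAR does not exceed max_far."""
    for t in np.linspace(-1.0, 1.0, 201):
        far, frr = far_frr(genuine, impostor, float(t))
        if far <= max_far:
            return float(t), far, frr
    raise ValueError("no threshold meets the FAR target")


# Synthetic similarity scores stand in for real same-person / different-person pairs.
rng = np.random.default_rng(0)
genuine = np.clip(rng.normal(0.70, 0.10, 2000), -1.0, 1.0)
impostor = np.clip(rng.normal(0.10, 0.12, 2000), -1.0, 1.0)

threshold, far, frr = pick_threshold(genuine, impostor, max_far=0.01)
print(f"threshold={threshold:.2f}  FAR={far:.4f}  FRR={frr:.4f}")
```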
---

*Part of the [Face Recognition Comparative Study](https://github.com/AbdoSaad24) — ArcFace · SphereFace · Triplet Loss.*