---
license: mit
tags:
- face-recognition
- arcface
- metric-learning
- resnet
- pytorch
- biometrics
- attendance-system
- computer-vision
- embeddings
library_name: pytorch
pipeline_tag: image-feature-extraction
language:
- en
---

# 🧠 ArcFace Industrial Face Recognition

> **ResNet-50 backbone trained with ArcFace loss on VGGFace2: production-grade face embeddings for real-world identity verification.**

This model is part of a broader comparative study of deep face recognition loss functions (ArcFace, SphereFace, Triplet Loss). After systematic evaluation across three experimental rounds, **ArcFace was selected as the production model** based on its training stability, numerical reliability at scale, and consistent generalization gains as data grows.

---

## 📊 Model Performance

| Metric | Value |
|---|---|
| Training Accuracy | **99%** |
| Validation Accuracy | **85%** |
| Training Loss | **0.03** |
| Validation Loss | **4.00** |
| Training Identities | 3,000 |
| Images per Identity | ~200 |
| Epochs | 100 |

> Validation loss reflects expected open-set generalization behavior: the model is trained on a closed identity set and evaluated against unseen faces. This gap narrows with more training data.

---

## 🏗️ Architecture

```
Input Image (112×112×3)
        │
        ▼
ResNet-50 Backbone
        │
        ▼
512-D L2-Normalized Embedding
        │
        ▼
ArcFace Head (Additive Angular Margin, m=0.5, s=64)
```

| Component | Details |
|---|---|
| Backbone | ResNet-50 |
| Embedding Dimension | 512 |
| Loss Function | ArcFace (m=0.5, s=64) |
| Input Resolution | 112 × 112 |
| Embedding Normalization | L2 |
| Optimizer | SGD with cosine decay |
| Learning Rate | 0.1 |

---

## 🚀 Usage

### Load the Model

```python
import torch
from src.config import load_config
from src.models.face_model import build_face_model

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
cfg = load_config("configs/base.yaml")

model = build_face_model(cfg, num_classes=None)  # inference mode

ckpt = torch.load("model.pt", map_location=device)
# The checkpoint may wrap the weights under "model_state" or be a bare state dict.
state_dict = ckpt.get("model_state", ckpt)
# Keep only the backbone weights and strip the "backbone." prefix.
backbone_state = {
    k.replace("backbone.", "", 1): v
    for k, v in state_dict.items()
    if k.startswith("backbone.")
}
# strict=False ignores missing or unexpected keys; inspect the returned
# key lists if the embeddings look wrong.
model.backbone.load_state_dict(backbone_state, strict=False)
model.eval().to(device)
```

### Extract Embeddings

```python
import cv2
import numpy as np
from src.data.preprocessing import PreprocessingPipeline

# Reuses `cfg`, `model`, and `device` from the loading step above.
pipeline = PreprocessingPipeline(
    preproc_cfg=cfg.preprocessing,
    image_size=cfg.data.image_size,
    apply_detection=False,
)

@torch.no_grad()
def get_embedding(img_bgr: np.ndarray) -> np.ndarray:
    img_rgb = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2RGB)  # OpenCV loads images as BGR
    chw = pipeline(img_rgb)
    tensor = torch.from_numpy(chw).unsqueeze(0).to(device)  # add batch dimension
    emb = model(tensor)
    return emb.squeeze().cpu().numpy().astype(np.float32)

img = cv2.imread("face.jpg")
if img is None:  # cv2.imread returns None instead of raising on an unreadable path
    raise FileNotFoundError("face.jpg could not be read")
embedding = get_embedding(img)  # shape: (512,), L2-normalized
```

### Compute Similarity

```python
from numpy.linalg import norm

def cosine_similarity(emb1: np.ndarray, emb2: np.ndarray) -> float:
    return float(np.dot(emb1, emb2) / (norm(emb1) * norm(emb2)))

# embedding_a and embedding_b are two outputs of get_embedding()
sim = cosine_similarity(embedding_a, embedding_b)
# sim ∈ [-1, 1]; higher = more similar
# Typical threshold for same-person: ≥ 0.5
```

---

## 🗄️ Production Pipeline

This model powers a complete two-stage attendance system:

### Stage 1: Database Population (`build_database.py`)

Registers known identities by computing gallery embeddings and storing them in **ChromaDB** (vector similarity search), linked to identity metadata in **MongoDB**. After registration, it runs a built-in Top-K evaluation on held-out probe images.

### Stage 2: Real-Time Inference (`realtime_attendance.py`)

Reads a live webcam feed, detects faces with **MTCNN**, embeds each crop through this model, and queries ChromaDB for the nearest registered identity. Recognized faces are labeled with name and similarity score; unknown faces are flagged. A cooldown timer prevents duplicate attendance logs.

---

## 📈 Gallery Evaluation Results

Evaluated on 50 registered identities (250 held-out probe images, never seen during training or registration):

| Metric | Result |
|---|---|
| Top-1 Accuracy | **92.00%** (230 / 250) |
| Top-3 Accuracy | **96.00%** (240 / 250) |
| Top-5 Accuracy | **96.80%** (242 / 250) |
| Failed reads | 0 / 250 |

> Probe images are drawn from the same VGGFace2 distribution as the gallery but are a completely separate split, never used during model training or gallery registration.

---

## 🔬 Why ArcFace for Production

ArcFace was selected over SphereFace (the Round 3 LFW leader) based on engineering considerations critical for deployment:

- **Additive angular margin** has a direct geometric interpretation on the hypersphere: the decision boundary is fixed and predictable, making threshold calibration reliable across unseen identities.
- **Numerical stability at scale**: SphereFace's multiplicative margin becomes sensitive as class count increases. ArcFace's formulation remains stable regardless.
- **Consistent data scaling**: validation accuracy rose from 83% (1,000 identities) to 85% (3,000 identities), which suggests generalization keeps improving as the training set grows.
- **Industry standard**: ArcFace is the de facto choice in production face recognition systems, with extensive tooling for quantization, ONNX export, and edge deployment.

---

## 📦 Training Data

| Dataset | Identities | Images | Resolution |
|---|---|---|---|
| VGGFace2 (subset) | 3,000 | ~600,000 | 112 × 112 |

Full dataset: [VGGFace2 112×112 on Kaggle](https://www.kaggle.com/datasets/yakhyokhuja/vggface2-112x112)

---

## 🔗 Related Resources

| Resource | Link |
|---|---|
| 📓 Training & Evaluation Notebook | [Kaggle: ArcFace Training](https://www.kaggle.com/code/abdelrhmansaadidrees/2-facerecognition-arcface/edit) |
| 📄 ArcFace Paper | [arXiv:1801.07698](https://arxiv.org/abs/1801.07698) |
| 🤗 Triplet Loss Model | [AbdoSaad24/TripletLossModels](https://huggingface.co/AbdoSaad24/TripletLossModels) |
| 🤗 SphereFace Model | [AbdoSaad24/BestSphereFaceModel](https://huggingface.co/AbdoSaad24/BestSphereFaceModel) |
| 🤗 ArcFace (Research, R3) | [AbdoSaad24/BestArcFaceModel](https://huggingface.co/AbdoSaad24/BestArcFaceModel) |

---

## 📋 Citation

```bibtex
@inproceedings{deng2019arcface,
  title     = {ArcFace: Additive Angular Margin Loss for Deep Face Recognition},
  author    = {Deng, Jiankang and Guo, Jia and Xue, Niannan and Zafeiriou, Stefanos},
  booktitle = {CVPR},
  year      = {2019}
}
```

---

## ⚠️ Limitations & Responsible Use

- This model was trained on a subset of VGGFace2. Performance may degrade on faces from demographics underrepresented in the training data.
- The model is intended for **attendance and access control systems** where subjects have consented to enrollment.
- Do not use for surveillance, tracking, or identification of individuals without explicit consent.
- Threshold selection (default: 0.5 cosine similarity) should be calibrated to your deployment environment: lower thresholds increase false acceptances, higher thresholds increase false rejections.

---

*Part of the [Face Recognition Comparative Study](https://github.com/AbdoSaad24): ArcFace · SphereFace · Triplet Loss.*
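
The threshold trade-off described under Limitations & Responsible Use can be made concrete with a small calibration sketch. This is only an illustration and not part of this repository: the helper names `far_frr` and `pick_threshold`, the `max_far` target, and the similarity scores are all hypothetical. In practice, the scores would come from genuine (same-person) and impostor (different-person) pairs collected in your own deployment environment.

```python
import numpy as np

def far_frr(genuine, impostor, threshold):
    """Return (false-accept rate, false-reject rate) at a cosine threshold."""
    genuine = np.asarray(genuine, dtype=np.float64)
    impostor = np.asarray(impostor, dtype=np.float64)
    far = float(np.mean(impostor >= threshold))  # impostor pairs wrongly accepted
    frr = float(np.mean(genuine < threshold))    # genuine pairs wrongly rejected
    return far, frr

def pick_threshold(genuine, impostor, max_far=0.01):
    """Lowest threshold on a 0.01 grid whose FAR is at or below max_far.

    FAR never increases as the threshold rises, so the first grid point
    that meets the target also gives the lowest FRR among those that do.
    """
    for t in np.linspace(-1.0, 1.0, 201):
        far, _ = far_frr(genuine, impostor, t)
        if far <= max_far:
            return float(t)
    return 1.0

# Made-up scores for illustration only; replace with real pair similarities.
genuine_scores = np.array([0.82, 0.74, 0.61, 0.47])
impostor_scores = np.array([0.05, 0.18, 0.31, 0.523])

threshold = pick_threshold(genuine_scores, impostor_scores, max_far=0.0)
far, frr = far_frr(genuine_scores, impostor_scores, threshold)
print(f"threshold={threshold:.2f} FAR={far:.2f} FRR={frr:.2f}")
```

Fixing a maximum false-accept rate and then taking the lowest threshold that satisfies it is one common way to calibrate; an access-control deployment might set `max_far` far stricter than an attendance log would.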