---
license: apache-2.0
language:
- en
tags:
- nsfw
- FALCONSAI
---

# Falconsai/nsfw_image_detection for NSFW Image Classification (2026 Edition)

## Model Description

The **Fine-Tuned Vision Transformer (ViT) V2** is a transformer encoder architecture adapted for image classification. Building on the baseline "google/vit-base-patch16-224-in21k", this 2026 edition has been retrained and fine-tuned to improve accuracy in moderating visual content (99.71% evaluation accuracy, up from 98.03% for the legacy version).

During the 2026 training phase, we used a dynamic learning rate scheduler (starting at 3e-5) and an effective batch size of 64 achieved through gradient accumulation. This configuration is intended to keep training computationally efficient while helping the model handle complex, high-resolution visual contexts more effectively than its predecessor.

The most significant upgrade in this release is the **expanded dataset**. Moving beyond the legacy 80,000-image corpus, this model was trained on a curated proprietary dataset of over **1.2 million images**. The dataset is far more varied and balances the "normal" and "nsfw" classes to reduce false positives (e.g., flagging classical art or medical imagery as NSFW) and to capture nuanced, borderline visual patterns.

The result is a more robust model intended for automated content safety, moderation, and trust-and-safety compliance.

---

## Gated Model Access

This model is **gated**. To use it in your environment:

1. **Request Access:** Log in to Hugging Face and click "Agree" on the [Falconsai/nsfw_image_detection_2026](https://huggingface.co/Falconsai/nsfw_image_detection_2026) page.
2. **Authentication:** Provide a [Hugging Face User Access Token](https://huggingface.co/settings/tokens) (with 'Read' permissions), either via the `token` parameter in your code or by running `huggingface-cli login`.
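
One way to avoid hard-coding the token is to read it from an environment variable. The sketch below is a minimal illustration, not part of the official instructions: the helper name `get_hf_token` and the variable name `HF_TOKEN` are assumptions chosen for this example.

```python
import os


def get_hf_token(env_var="HF_TOKEN"):
    """Return the access token stored in an environment variable.

    The variable name is an assumption for this sketch. Raising early
    gives a clearer message than a failed gated download later on.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set {env_var} to a Hugging Face token with 'Read' permission."
        )
    return token
```

The returned string can then be passed wherever a `token` argument is expected, for example `token=get_hf_token()`.
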

---

## Intended Uses & Limitations

### Intended Uses

* **Automated Content Moderation**: The primary use is real-time classification and filtering of NSFW (Not Safe for Work) images across social platforms, forums, and cloud storage.
* **Trust and Safety Pipelines**: Acts as a high-confidence first pass in multi-tiered, human-in-the-loop moderation systems.
* **Edge-Device Deployment**: The ONNX/YOLO-compatible versions are optimized for fast inference on edge devices and in mobile environments.

### Limitations

* **Domain Specificity**: This model is specialized for NSFW image classification. Applying it to general object detection or unrelated classification tasks will yield poor results.
* **Cultural Context**: The definition of NSFW varies across cultures. Users should calibrate confidence thresholds against their own community guidelines.

---

## How to Use

### 1. Using Hugging Face `pipeline` (High-Level Helper)

```python
from PIL import Image
from transformers import pipeline

# Load image (placeholder path: replace with your own image file)
img = Image.open("path/to/your/image.jpg")

# Initialize 2026 pipeline
classifier = pipeline(
    "image-classification",
    model="Falconsai/nsfw_image_detection_2026",
    token="YOUR_HF_TOKEN_HERE"
)

result = classifier(img)
print(result)
```

### 2. Loading the Model Directly (PyTorch)

```python
import torch
from PIL import Image
from transformers import AutoModelForImageClassification, ViTImageProcessor

# Load image (placeholder path: replace with your own image file)
img = Image.open("path/to/your/image.jpg")

# Initialize model and processor
model_name = "Falconsai/nsfw_image_detection_2026"
model = AutoModelForImageClassification.from_pretrained(model_name, token="YOUR_HF_TOKEN_HERE")
processor = ViTImageProcessor.from_pretrained(model_name, token="YOUR_HF_TOKEN_HERE")

# Run inference without tracking gradients
with torch.no_grad():
    inputs = processor(images=img, return_tensors="pt")
    outputs = model(**inputs)
    logits = outputs.logits

# Extract prediction: index of the highest logit, mapped to its label
predicted_label = logits.argmax(-1).item()
print(f"Predicted Class: {model.config.id2label[predicted_label]}")
```

### 3. Running the ONNX / YOLOv9 Version

For fast local inference, you can use the ONNX exported model.

```python
import os
import json
import numpy as np
import onnxruntime as ort
import matplotlib.pyplot as plt
from PIL import Image


def predict_with_yolov9(image_path, model_path, labels_path, input_size):
    """Run inference using the converted YOLOv9 ONNX model."""
    # labels.json maps class indices (as strings) to label names
    with open(labels_path, "r") as f:
        labels = json.load(f)

    # Preprocess image: resize, scale to [0, 1], reorder to channels-first
    original_image = Image.open(image_path).convert("RGB")
    image_resized = original_image.resize(input_size, Image.Resampling.BILINEAR)
    image_np = np.array(image_resized, dtype=np.float32) / 255.0
    image_np = np.transpose(image_np, (2, 0, 1))  # [C, H, W]
    input_tensor = np.expand_dims(image_np, axis=0).astype(np.float32)  # [1, C, H, W]

    # Load YOLOv9 ONNX model
    session = ort.InferenceSession(model_path)
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name

    # Run inference
    outputs = session.run([output_name], {input_name: input_tensor})
    predictions = outputs[0]

    # Postprocess: pick the class with the highest score
    predicted_index = int(np.argmax(predictions))
    predicted_label = labels[str(predicted_index)]
    return predicted_label, original_image


def display_single_prediction(image_path, model_path, labels_path, input_size=(224, 224)):
    """Predict and visually display the result."""
    try:
        prediction, img = predict_with_yolov9(image_path, model_path, labels_path, input_size)
        fig, ax = plt.subplots(1, 1, figsize=(6, 6))
        ax.imshow(img)
        ax.set_title(f"Prediction: {prediction}", fontsize=14, fontweight="bold")
        ax.axis("off")
        plt.tight_layout()
        plt.show()
    except Exception as e:
        print(f"Error processing {image_path}: {e}")


# --- Execution Example ---
if __name__ == "__main__":
    from huggingface_hub import hf_hub_download

    # 1. Configuration
    hf_token = "YOUR_HF_TOKEN_HERE"  # Replace with your actual Hugging Face Read Token
    repo_id = "Falconsai/nsfw_image_detection_2026"
    img_path = "path/to/your/single_image.jpg"

    # 2. Download gated files from the specific 'yolo' subfolder
    try:
        # Using the actual filenames from the repository
        model_onnx = hf_hub_download(
            repo_id=repo_id,
            filename="falconsai_yolov9_nsfw_model.pt",
            subfolder="yolo",
            token=hf_token
        )
        labels_json = hf_hub_download(
            repo_id=repo_id,
            filename="labels.json",
            subfolder="yolo",
            token=hf_token
        )

        # 3. Run Inference
        if os.path.exists(img_path):
            # Note: predict_with_yolov9 uses onnxruntime, which expects an ONNX model.
            # This works only if 'falconsai_yolov9_nsfw_model.pt' is an ONNX file
            # with a .pt extension.
            display_single_prediction(img_path, model_onnx, labels_json)
        else:
            print(f"Image file not found at: {img_path}")
    except Exception as e:
        print(f"Access Denied or File Not Found: {e}")
        print("Ensure you have accepted the gate terms on HF and your token is correct.")
```

---

## Training Data & 2026 Metrics

### Dataset Expansion

The 2026 iteration uses a proprietary dataset of **1,250,000 images**, roughly 15.6 times the size of the legacy 80,000-image corpus. The dataset underwent deduplication, bias mitigation, and edge-case augmentation (e.g., handling complex lighting, varying resolutions, and non-photographic explicit material such as digital art).

### Performance Comparison

The larger dataset, paired with updated training infrastructure, resulted in lower evaluation loss, higher accuracy, and faster evaluation.

| Metric | Legacy Version (80k dataset) | 2026 Version (1.2M dataset) | Improvement |
| --- | --- | --- | --- |
| **Evaluation Loss** | 0.0746 | **0.0124** | *About 83% lower loss* |
| **Evaluation Accuracy** | 98.03% | **99.71%** | *+1.68 percentage points* |
| **Eval Runtime** | 304.98s | **184.20s** | *About 40% shorter evaluation run* |
| **Samples per Second** | 52.46 | **86.15** | *+64% throughput* |

---

## Ethical Considerations & Disclaimer

It is essential to use this model responsibly and ethically. Automated moderation models should be deployed alongside human oversight, especially when decisions involve sensitive content, account bans, or legal compliance.

*Disclaimer:* The model's performance reflects the data it was fine-tuned on. Although bias mitigation was performed, edge cases may still produce false positives or false negatives. Users must assess the model's suitability against their specific community guidelines.

## References

* [Hugging Face Model Hub](https://huggingface.co/models)
* [Vision Transformer (ViT) Paper](https://arxiv.org/abs/2010.11929)
* [ImageNet-21k Dataset](http://www.image-net.org/)
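
The advice on calibrating confidence thresholds and keeping human oversight in the loop could be sketched as a simple routing step. This is an illustrative sketch only: the function names, the `block_at` and `review_at` values, and the example logits are hypothetical, and suitable thresholds depend on your own validation data and community guidelines.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(nsfw_score, block_at=0.90, review_at=0.50):
    """Map an NSFW probability to a moderation action.

    block_at and review_at are hypothetical defaults, not values
    recommended by the model authors.
    """
    if not 0.0 <= nsfw_score <= 1.0:
        raise ValueError("nsfw_score must be a probability in [0, 1]")
    if nsfw_score >= block_at:
        return "block"
    if nsfw_score >= review_at:
        return "human_review"  # borderline cases go to a moderator
    return "allow"


# Made-up logits ordered as [normal, nsfw]; with the real model, take
# the label order from model.config.id2label.
probs = softmax([0.2, 1.1])
print(route(probs[1]))
```

With the `pipeline` helper, the same routing can be applied to the `score` reported for the `nsfw` label, since the pipeline already returns probabilities rather than logits.
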