---
base_model: nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3
language:
- en
- hi
license: other
tags:
- content-moderation
- safety
- lora
- peft
- hindi
- english
datasets:
- nvidia/Nemotron-Safety-Guard-Dataset-v3
---

# Nemotron Safety Guard: Hindi + English

QLoRA fine-tune of [nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3](https://huggingface.co/nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3) for Hindi and English content safety classification.

## what this is

the original model supports 9 languages. this fine-tune specializes it for hindi (`hi`) and english (`en`) only, trained on a balanced sample from the Nemotron-Safety-Guard-Dataset-v3.

## training details

| setting | value |
|---|---|
| base model | nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3 |
| method | QLoRA (4-bit, nf4) |
| lora rank | 8 |
| lora alpha | 32 |
| target modules | q_proj, v_proj |
| trainable params | 3.4M (LoRA adapters only; base weights quantized to 4-bit) |
| languages | English, Hindi |
| training samples | 1000 (balanced) |
| epochs | 1 |
| learning rate | 2e-4 |

## how to use

```python
import json  # for parsing the model's JSON verdict

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "nvidia/Llama-3.1-Nemotron-Safety-Guard-8B-v3"
adapter_id = "Amaanaliii/nemotron-safety-guard-hi-en"

# The tokenizer is loaded from the adapter repo, the weights from the base model.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapter on top of the base model and switch to inference mode.
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
```

## output format

```json
{"User Safety": "safe" | "unsafe", "Response Safety": "safe" | "unsafe", "Safety Categories": "Violence, ..."}
```

`Response Safety` and `Safety Categories` are omitted when not applicable.
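
Because two of the three fields can be missing, code that consumes the verdict should not index them directly. The snippet below is a minimal sketch of one way to parse the output format above. `parse_safety_output` is a hypothetical helper, not part of this repo, and it assumes the generated text is a single valid JSON object whose `Safety Categories` value is a comma-separated string.

```python
import json


def parse_safety_output(raw: str) -> dict:
    """Parse one generated verdict into a plain dict (hypothetical helper)."""
    data = json.loads(raw.strip())
    # "Safety Categories" is assumed to be a comma-separated string;
    # a missing key is treated as "no categories".
    raw_categories = data.get("Safety Categories", "")
    categories = [c.strip() for c in raw_categories.split(",") if c.strip()]
    return {
        "user_safety": data["User Safety"],
        # None marks a field the model omitted as not applicable.
        "response_safety": data.get("Response Safety"),
        "categories": categories,
    }


example = '{"User Safety": "unsafe", "Safety Categories": "Violence"}'
print(parse_safety_output(example))
```

Mapping an omitted `Response Safety` to `None` rather than `"safe"` keeps "not evaluated" distinct from "evaluated and safe".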