---
license: apache-2.0
library_name: transformers
base_model: Salesforce/codet5p-220m
tags:
- security
- vulnerability-fix
- code-repair
- code-generation
- codet5
- owasp
- cwe
language:
- en
- code
pipeline_tag: text2text-generation
datasets:
- ayshajavd/code-security-vulnerability-dataset
model-index:
- name: codet5p-vuln-fixer
  results:
  - task:
      type: text2text-generation
      name: Vulnerability Fix Generation
    dataset:
      type: ayshajavd/code-security-vulnerability-dataset
      name: Code Security Vulnerability Dataset
      split: test
    metrics:
    - type: bleu
      value: 81.0
      name: BLEU
    - type: rouge
      value: 0.788
      name: ROUGE-L
    - type: rouge
      value: 0.802
      name: ROUGE-1
    - type: rouge
      value: 0.745
      name: ROUGE-2
---

# CodeT5+ Vulnerability Fixer

A code repair model that generates **secure fixes** for vulnerable code. Given vulnerable code + CWE type + programming language, it produces the patched version.

Fine-tuned from [Salesforce/codet5p-220m](https://huggingface.co/Salesforce/codet5p-220m) (220M parameters) on 7,374 vulnerable→fixed code pairs.

## Quick Start

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "ayshajavd/codet5p-vuln-fixer"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)
model.eval()

# Vulnerable snippet to repair (passed to the model as plain text)
code = """
def get_user(username):
    query = f"SELECT * FROM users WHERE username = '{username}'"
    conn = sqlite3.connect('db.sqlite')
    return conn.execute(query).fetchone()
"""

# CWE-aware input format
input_text = f"fix SQL Injection vulnerability in python: {code}"
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

# Beam search decoding without gradient tracking
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_length=512,
        num_beams=5,
        early_stopping=True,
        no_repeat_ngram_size=3,
    )

fixed_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(fixed_code)
```

## Model Details

| Property | Value |
|----------|-------|
| **Architecture** | T5ForConditionalGeneration (encoder-decoder, 12 layers each) |
| **Base Model** | [Salesforce/codet5p-220m](https://huggingface.co/Salesforce/codet5p-220m) |
| **Parameters** | 222,882,048 (≈223M) |
| **Task** | Seq2Seq code repair (vulnerable → fixed) |
| **Input Format** | `fix <cwe_name> vulnerability in <language>: <code>` |
| **Max Sequence Length** | 512 tokens (input and output) |
| **Generation** | Beam search (num_beams=5) |

## Evaluation Results (Test Set, 941 samples)

| Metric | Score |
|--------|-------|
| **BLEU** | **81.0** |
| **ROUGE-1** | **0.802** |
| **ROUGE-2** | **0.745** |
| **ROUGE-L** | **0.788** |
| **Exact Match** | 1.4% |
| **Eval Loss** | **0.175** |

### vs Previous Model (flan-t5-small)

| | Old (v1) | New (v2) | Improvement |
|---|---|---|---|
| Base model | flan-t5-small (60M) | CodeT5+ 220M | 3.7x larger |
| Eval loss | 0.547 | **0.175** | 3.1x lower |
| CWE-aware input | ❌ | ✅ | Context about vulnerability type |
| BLEU evaluation | ❌ | **81.0** | Proper code similarity metric |

## Supported Languages

Python, JavaScript, Java, C, C++, PHP, Go, Ruby


The model was trained on a diverse multi-language dataset. Performance is strongest on C/C++ (largest training subset from BigVul).

## Training Details

| Parameter | Value |
|-----------|-------|
| Learning Rate | 1e-4 (constant schedule) |
| Effective Batch Size | 32 (8/device × 2 GPUs × 2 grad_accum) |
| Epochs | 6 (stopped by early stopping; best checkpoint at epoch 3) |
| Best Epoch | 3 (eval_loss=0.1752) |
| Precision | fp16 |
| Gradient Checkpointing | Enabled |
| Early Stopping | Patience=3 |
| Optimizer | AdamW |
| Hardware | 2× NVIDIA T4 16GB (Kaggle) |

### Training Recipe References

- **T5APR** (arXiv:2309.15742): lr=1e-4 with a constant scheduler, Optuna-validated for CodeT5 code repair
- **MultiMend** (arXiv:2501.16044): same configuration, validated on 6 benchmarks

## Training Data

Trained on the [code-security-vulnerability-dataset](https://huggingface.co/datasets/ayshajavd/code-security-vulnerability-dataset):

- **7,374 training** samples (vulnerable code with fixes)
- **994 validation** samples
- **941 test** samples

Filtered from 175K total samples to keep only vulnerable samples with meaningful code fixes (more than 10 characters).

## Input Format

The model uses a CWE-aware input format that tells it *what* vulnerability to fix:

```
fix <cwe_name> vulnerability in <language>: <code>
```

Examples:

- `fix SQL Injection vulnerability in python: <code>`
- `fix Buffer Overflow vulnerability in c: <code>`
- `fix Cross-Site Scripting vulnerability in javascript: <code>`

## Limitations

1. **512 token limit**: Inputs longer than 512 tokens are truncated, so fix quality degrades for very long functions
2. **Formatting**: Generated fixes may lose original indentation/formatting
3. **Rare CWEs**: Performance is lower on vulnerability types with few training examples
4. **Not a replacement**: Should complement manual code review and established SAST tools
5. **Language bias**: Strongest on C/C++ (largest training subset)

## Interactive Demo

Try the model in our [Code Security Analyzer Space](https://huggingface.co/spaces/ayshajavd/code-security-analyzer): paste any code and get vulnerability detection + fix suggestions.

## Citation

```bibtex
@misc{codet5p-vuln-fixer,
  title={CodeT5+ Vulnerability Fixer: CWE-Aware Code Repair with Seq2Seq Generation},
  author={ayshajavd},
  year={2025},
  url={https://huggingface.co/ayshajavd/codet5p-vuln-fixer}
}
```
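
The CWE-aware input format and the supported-language list described in this card can be combined into a small prompt helper. The sketch below is illustrative only: the function names, the lowercase language tokens (in particular `c++`), and the whitespace handling are assumptions rather than part of the released model; only the `fix ... vulnerability in ...:` template comes from this card. `exceeds_token_limit` accepts any Hugging Face-style tokenizer callable and flags prompts that would be cut at the 512-token limit.

```python
# Hypothetical helpers; only the prompt template itself is taken from this card.

# Assumed lowercase language tokens, mirroring the card's examples ("python", "c", ...).
SUPPORTED_LANGUAGES = {"python", "javascript", "java", "c", "c++", "php", "go", "ruby"}


def build_prompt(cwe_name: str, language: str, code: str) -> str:
    """Assemble the CWE-aware prompt from a CWE name, a language, and the code."""
    lang = language.strip().lower()
    if lang not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return f"fix {cwe_name.strip()} vulnerability in {lang}: {code}"


def exceeds_token_limit(prompt: str, tokenizer, max_length: int = 512) -> bool:
    """Return True if the tokenized prompt is longer than max_length tokens."""
    # Calling a Hugging Face tokenizer on a string returns a mapping with "input_ids".
    return len(tokenizer(prompt)["input_ids"]) > max_length
```

With the tokenizer from the Quick Start, `exceeds_token_limit(build_prompt("SQL Injection", "python", code), tokenizer)` would indicate whether a function should be split or shortened before it is sent to the model.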