---
base_model: WizardLMTeam/WizardCoder-Python-13B-V1.0
library_name: peft
license: apache-2.0
pipeline_tag: text-generation
---

# Model Card for MegaBugInject

This model injects bugs into correct Python programs. It was used to corrupt correct programs, and the resulting buggy programs form the core of the MegaBugFix benchmark.


## Model Details

- **Developed by:** Balázs Szalontai
- **Model type:** Decoder-only Language Model
- **Language(s) (NLP):** None
- **License:** Apache License 2.0
- **Finetuned from model:** WizardLMTeam/WizardCoder-Python-13B-V1.0

## Uses

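The following snippet loads the base model, applies the fine-tuned PEFT adapter, and samples several corrupted variants of a program. The program is passed to the model wrapped in `[PYTHON]` tags, and the model answers with a diff inside `[DIFF]` tags, from which the corrupted program is reconstructed: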

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import re

model_id_pretrained = 'WizardLMTeam/WizardCoder-Python-13B-V1.0'
model_id_finetuned  = 'szalontaib/MegaBugInject'

tokenizer = AutoTokenizer.from_pretrained(model_id_pretrained, add_eos_token=False)
model = AutoModelForCausalLM.from_pretrained(model_id_pretrained, device_map='auto', dtype=torch.float16, trust_remote_code=True)
model = PeftModel.from_pretrained(model, model_id_finetuned)


def extract_diff(model_output):
    # Return the text between the first [DIFF] ... [/DIFF] pair, or None if absent.
    pattern = re.compile(r'\s*\[DIFF\](.*?)\[/DIFF\]\s*', re.DOTALL)
    matches = pattern.findall(model_output)
    if matches:
        return matches[0].strip('\n')
    return None

def diff2code(diff : str) -> str:
    # Drop removed lines ('-') and strip the two-character prefix from the rest.
    return '\n'.join(
        line[2:] for line in diff.splitlines()
        if not line.startswith('-')
    ).strip()

def corrupt(program, model, tokenizer, **generation_kwargs):
    # Wrap the program in [PYTHON] tags and open a [DIFF] block for the model to complete.
    prompt = f'[PYTHON]\n{program.strip()}\n[/PYTHON]\n[DIFF]\n'
    model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    generated_ids = model.generate(**model_inputs, **generation_kwargs)
    outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    # Keep only the outputs that contain a complete diff, then apply each diff.
    diffs = [extract_diff(output) for output in outputs]
    corrupted_programs = [diff2code(diff) for diff in diffs if diff is not None]
    return corrupted_programs


test_code = '''
def bitcount(n):
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count
'''.strip()

corrupted_programs = corrupt(
    test_code, model, tokenizer, 
    do_sample=True, 
    temperature=0.5, 
    max_new_tokens=4096, 
    num_return_sequences=5,
)

for corrupted_program in corrupted_programs:
    print('-'*30)
    print(corrupted_program)
```
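
To see what the post-processing does without loading the 13B model, the sketch below runs `extract_diff` and `diff2code` on a hand-written string. The diff is hypothetical and is not real model output. It assumes the format implied by `diff2code`: every line starts with a two-character prefix (`'  '` for unchanged, `'- '` for removed, `'+ '` for added lines).

```python
import re

def extract_diff(model_output):
    # Same logic as in the usage snippet: take the first [DIFF] ... [/DIFF] body.
    pattern = re.compile(r'\s*\[DIFF\](.*?)\[/DIFF\]\s*', re.DOTALL)
    matches = pattern.findall(model_output)
    return matches[0].strip('\n') if matches else None

def diff2code(diff: str) -> str:
    # Skip removed lines and strip the two-character prefix from all others.
    return '\n'.join(
        line[2:] for line in diff.splitlines()
        if not line.startswith('-')
    ).strip()

# Hypothetical diff, written by hand for illustration (assumed prefix format).
diff_lines = [
    ('  ', 'def bitcount(n):'),
    ('  ', '    count = 0'),
    ('  ', '    while n:'),
    ('- ', '        n &= n - 1'),
    ('+ ', '        n ^= n - 1'),
    ('  ', '        count += 1'),
    ('  ', '    return count'),
]
sample_output = (
    '[PYTHON]\n...original program...\n[/PYTHON]\n[DIFF]\n'
    + '\n'.join(prefix + line for prefix, line in diff_lines)
    + '\n[/DIFF]'
)

diff = extract_diff(sample_output)
corrupted = diff2code(diff)
print(corrupted)
```

The removed line is dropped, the added line takes its place, and the result is the full corrupted program rather than a patch. Outputs without a closing `[/DIFF]` tag yield `None` from `extract_diff`, which is why `corrupt` may return fewer programs than `num_return_sequences`.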

## Citation

If you use our benchmark or bug injection model, please cite our paper.

```bibtex
@misc{szalontai2026diffbasedcodecorruptionusing,
      title={Diff-Based Code Corruption using LLMs for Large-Scale Bugfix Benchmarking}, 
      author={Balázs Szalontai and Ábel Szauter and Balázs Márton and Péter Verebics and Balázs Pintér and Tibor Gregorics},
      year={2026},
      eprint={2606.29088},
      archivePrefix={arXiv},
      primaryClass={cs.SE},
      url={https://arxiv.org/abs/2606.29088}, 
}
```