### 📝 Overview:
This is the LoRA-finetuned version of the original model on SafeSciTrain to enhance the safety. This model support standard (text) behaviors and contextual behaviors.

📚 To test the model, please use `vllm>=0.11.1`. Code to use the model can be found in our [github](https://github.com/yangyangyang127/SafeSci) 💻


### 💬📊 Performances

Performance on **Safety Knowledge** questions. We present the accuracy of seven fields and their average.

|             | Chem. | Bio. | Med. | Mat. | Eng. | Phy. | Psy. | Avg. |
| ----------- | ----- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| Qwen3-8B    | 0.52  | 0.56 | 0.56 | 0.68 | 0.68 | 0.67 | 0.68 | 0.59 |
| *+ LoRA*    | 0.77  | 0.42 | 0.84 | 0.77 | 0.63 | 0.53 | 0.69 | 0.70 |
| Qwen3-14B   | 0.56  | 0.65 | 0.53 | 0.75 | 0.66 | 0.59 | 0.66 | 0.60 |
| *+ LoRA*    | 0.84  | 0.45 | 0.88 | 0.86 | 0.70 | 0.55 | 0.71 | 0.75 |
| Llama3.1-8B | 0.46  | 0.75 | 0.57 | 0.66 | 0.53 | 0.56 | 0.62 | 0.57 |
| *+ LoRA*    | 0.79  | 0.42 | 0.81 | 0.72 | 0.53 | 0.56 | 0.68 | 0.66 |


Safety rate on **Safety Risk** questions. We present the rejection rate of seven fields and their average.

|             | Chem. | Bio. | Med. | Mat. | Eng. | Phy. | Psy. | Avg. |
| ----------- | ----- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| Qwen3-8B    | 0.37  | 0.41 | 0.21 | 0.52 | 0.16 | 0.23 | 0.14 | 0.31 |
| *+ LoRA*    | 0.83  | 0.95 | 0.94 | 0.85 | 0.19 | 0.28 | 0.08 | 0.64 |
| Qwen3-14B   | 0.31  | 0.37 | 0.15 | 0.39 | 0.14 | 0.16 | 0.11 | 0.26 |
| *+ LoRA*    | 0.76  | 0.90 | 0.53 | 0.94 | 0.26 | 0.53 | 0.14 | 0.60 |
| Llama3.1-8B | 0.49  | 0.55 | 0.69 | 0.87 | 0.27 | 0.33 | 0.20 | 0.41 |
| *+ LoRA*    | 0.94  | 0.86 | 0.97 | 0.99 | 0.32 | 0.29 | 0.21 | 0.58 |


### 📖 Citation:

```
@misc{zhu2026safescisafetyevaluationlarge,
      title={SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond}, 
      author={Xiangyang Zhu and Yuan Tian and Qi Jia and Kaiwei Zhang and Zicheng Zhang and Chunyi Li and Kaiyuan Ji and Dongrui Liu and Zijian Chen and Lu Sun and Renrui Zhang and Yan Teng and Jing Shao and Wei Sun and Xia Hu and Yu Qiao and Guangtao Zhai},
      year={2026},
      eprint={2603.01589},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2603.01589}, 
}
```