---
base_model:
- mzhaoshuai/zephyr-7b-alpha-conf-sft
library_name: peft
license: apache-2.0
datasets:
- shuchangtao/CONQORD_dataset
tags:
- refalign
---

# RefAlign: RL with Similarity-based Rewards

**GitHub repository**: https://github.com/mzhaoshuai/RefAlign

**Paper**: [Learning from Reference Answers: Versatile Language Model Alignment without Binary Human Preference Data](https://huggingface.co/papers/2504.09895).

This PEFT adapter was trained with RefAlign on the [`conqord_step3_data`](https://huggingface.co/datasets/shuchangtao/CONQORD_dataset/tree/main/conqord_step3_data) directory of the CONQORD dataset.

## Framework versions

- PEFT 0.11.1
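
Since the metadata lists `peft` as the library and `mzhaoshuai/zephyr-7b-alpha-conf-sft` as the base model, loading might look roughly like the sketch below. It assumes the base checkpoint loads as a standard causal LM through `transformers`. `ADAPTER_ID` is a hypothetical placeholder, not the real repository name, and must be replaced with this adapter's actual Hub ID. The helper name `load_refalign_model` is also introduced here for illustration only.

```python
# Base model ID taken from this card's metadata.
BASE_MODEL_ID = "mzhaoshuai/zephyr-7b-alpha-conf-sft"
# Hypothetical placeholder: replace with the actual Hub ID of this adapter.
ADAPTER_ID = "your-namespace/your-refalign-adapter"


def load_refalign_model(adapter_id: str = ADAPTER_ID,
                        base_model_id: str = BASE_MODEL_ID):
    """Load the base model, attach the PEFT adapter, return (model, tokenizer)."""
    # Heavy dependencies are imported lazily, so defining this helper is cheap.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the adapter weights stored under adapter_id.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `load_refalign_model()` downloads the full 7B base checkpoint, so it needs sufficient disk space and memory. Options such as reduced precision or device placement are left out of this sketch.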