Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs
Paper • 2605.30501 • Published • 29
To join, use your KCL email! Machine Learning, Natural Language Processing, Gaussian Processes, Human-Computer Interaction
EDIT: Evidence-Diagnosed Intervention Training for Rule-Faithful LLM Grading