ai-safety-institute's Collections

Lie Detection

Did you lie? Evaluating Lie Detectors across Model Scale and Belief-Verified Model Organisms