Did you lie? Evaluating Lie Detectors across Model Scale and Belief-Verified Model Organisms
AI & ML interests
AI Safety
models 589
ai-safety-institute/dyl-truthful-qwen-qwen3.5-4b
ai-safety-institute/dyl-truthful-qwen-qwen3.5-122b-a10b-fp8
ai-safety-institute/dyl-truthful-qwen-qwen3-14b
ai-safety-institute/dyl-zai-org-glm-5.1-fp8
ai-safety-institute/dyl-zai-org-glm-4.7-flash
ai-safety-institute/dyl-zai-org-glm-4.7-fp8
ai-safety-institute/dyl-zai-org-glm-4.5-air-fp8
ai-safety-institute/dyl-sandbagging-games-yorick
ai-safety-institute/dyl-sandbagging-games-tarun
ai-safety-institute/dyl-sandbagging-games-oak
datasets 38
ai-safety-institute/unrelated-questions-follow-up-questions
Viewer • Updated • 65 • 20
ai-safety-institute/lie-detection-rollouts
Viewer • Updated • 1.52M • 6.6k • 1
ai-safety-institute/eval_sandbagger_ood_eval
Viewer • Updated • 100 • 90
ai-safety-institute/gender_secret_ood_eval
Viewer • Updated • 100 • 259
ai-safety-institute/realitytest
Viewer • Updated • 4.24k • 209
ai-safety-institute/qwen3_5_27b_eval_sandbagger_rollouts
Viewer • Updated • 3.42k • 25
ai-safety-institute/qwen3_5_27b_ab_hallucinates_citations_rollouts
Viewer • Updated • 4.52k • 28
ai-safety-institute/qwen3_5_27b_gender_secret_female_rollouts
Viewer • Updated • 4.98k • 36
ai-safety-institute/qwen3_5_27b_gender_secret_male_rollouts
Viewer • Updated • 4.95k • 33
ai-safety-institute/qwen3_5_27b_ab_animal_welfare_rollouts
Viewer • Updated • 4.42k • 138