RealityTest benchmark

updated 2 days ago

Datasets for the RealityTest project, investigating how people query identity during ambiguous interactions, and how models respond.

  • ai-safety-institute/realitytest

    Viewer • Updated about 1 month ago • 4.24k • 209

  • ai-safety-institute/realitytest-speech

    Viewer • Updated Apr 29 • 1.2k • 5