Qimma leaderboard datasets

updated Apr 21

Benchmarks used in the Qimma Leaderboard

  • qimma/MCQ_ArabCulture

    Viewer • Updated Feb 17 • 3.48k • 74

  • qimma/MCQ_AraDiCE-Culture

    Viewer • Updated Mar 10 • 180 • 52

  • qimma/MCQ_ArabicMMLU

    Viewer • Updated Jan 30 • 13.7k • 74

  • qimma/QA_MedArabiQ

    Viewer • Updated Feb 10 • 200 • 100

  • qimma/MCQ_3LM-STEM

    Viewer • Updated Feb 18 • 2.61k • 58

  • qimma/MCQ_MedArabiQ

    Viewer • Updated Feb 12 • 299 • 47

  • qimma/MCQ_PalmX

    Viewer • Updated Feb 12 • 2.98k • 45

  • qimma/QA_ArabLegalEval

    Viewer • Updated Feb 12 • 79 • 38

  • qimma/MCQ_MedAraBench

    Viewer • Updated Feb 16 • 4.93k • 79

  • qimma/MCQ_GAT

    Viewer • Updated Feb 16 • 14k • 55

  • qimma/MCQ_MizanQA

    Viewer • Updated Feb 17 • 1.73k • 37

  • qimma/MCQ_AraTrust

    Viewer • Updated Feb 17 • 522 • 35

  • qimma/QA_fann_or_flop

    Viewer • Updated Feb 5 • 6.94k • 7

  • Are Arabic Benchmarks Reliable? QIMMA's Quality-First Approach to LLM Evaluation

    Paper • 2604.03395 • Published Apr 3 • 2

  • Qimma Leaderboard

    Space • Running on CPU Upgrade • 16
