Haigini's picture
eval results: fix mean arithmetic (76.36 = average of the five dimensions)
1f2cf02 verified
- dataset:
id: llamaindex/ParseBench
task_id: mean
value: 76.36
date: '2026-06-12'
source:
url: https://huggingface.co/datasets/llamaindex/ParseBench
name: ParseBench
user: Haigini
notes: "Pipeline name: kdl_frontier_nano (standalone provider, run-llama/ParseBench PR #49). Full-set single-pass run (2,553 test cases, 0 failures): vLLM serving of the public weights + the deterministic provider pipeline. Layout dimension = layout_element_rule_pass_rate (official leaderboard metric), as re-measured by the ParseBench maintainers on this provider's predictions with the Chart/Flowchart label mapping (run-llama/ParseBench PR #49). Reproducible end-to-end with the provider in the PR."
- dataset:
id: llamaindex/ParseBench
task_id: table
value: 85.56
date: '2026-06-12'
source:
url: https://huggingface.co/datasets/llamaindex/ParseBench
name: ParseBench
user: Haigini
notes: "Pipeline name: kdl_frontier_nano (standalone provider, run-llama/ParseBench PR #49)"
- dataset:
id: llamaindex/ParseBench
task_id: layout
value: 78.84
date: '2026-06-12'
source:
url: https://huggingface.co/datasets/llamaindex/ParseBench
name: ParseBench
user: Haigini
notes: "Pipeline name: kdl_frontier_nano (standalone provider, run-llama/ParseBench PR #49)"
- dataset:
id: llamaindex/ParseBench
task_id: text_content
value: 87.19
date: '2026-06-12'
source:
url: https://huggingface.co/datasets/llamaindex/ParseBench
name: ParseBench
user: Haigini
notes: "Pipeline name: kdl_frontier_nano (standalone provider, run-llama/ParseBench PR #49)"
- dataset:
id: llamaindex/ParseBench
task_id: text_formatting
value: 66.81
date: '2026-06-12'
source:
url: https://huggingface.co/datasets/llamaindex/ParseBench
name: ParseBench
user: Haigini
notes: "Pipeline name: kdl_frontier_nano (standalone provider, run-llama/ParseBench PR #49)"
- dataset:
id: llamaindex/ParseBench
task_id: chart
value: 63.41
date: '2026-06-12'
source:
url: https://huggingface.co/datasets/llamaindex/ParseBench
name: ParseBench
user: Haigini
notes: "Pipeline name: kdl_frontier_nano (standalone provider, run-llama/ParseBench PR #49)"