Commit History

Add structural tag inference (Stage 3s) and compact eval output
a16e111

Claude committed on

Default min_why to strong_implied; add retrieval gap analysis script
4968635

Claude committed on

eval with expanded GT and leaf metrics
7995e65

Food Desert committed on

Normalize GT annotations: expand implications, exclude non-evaluable tags
14e5c38

Claude committed on

eval with min_why=explicit + implications
6fc4b56

Food Desert committed on

Add tag implication expansion (fox→canine→canid→mammal)
eeada1d

Claude committed on

eval results with min_why=explicit bug fix
de8b5a3

Food Desert committed on

Remove data/eval_results/ from .gitignore so eval results are tracked
3edd051

Claude committed on

Fix min_why not passed to workers in parallel eval mode
054dd0f

Claude committed on

Add latest eval results
096cdd3

Food Desert committed on

Add --min-why threshold to filter Stage 3 selections by confidence level
09a248d

Claude committed on

adding tag implications file
962e2b4

Food Desert committed on

Add diagnostic eval metrics, why-distribution tracking, and generic character filter
349b999

Claude committed on

Add n=10 eval results for analysis
df66964

Food Desert committed on

Add parallel processing to eval pipeline with ThreadPoolExecutor
12dfa28

Claude committed on

Add independent character tag metrics to eval pipeline
f1b4da2

Claude committed on

Improve eval harness: shuffle samples, always write results
133d74c

Claude committed on

Add end-to-end evaluation harness for pipeline metrics
6909d06

Claude committed on

Expand alias filter tests with real CSV data and pipeline tests
ea9e11c

Claude committed on

Add alias-based character tag filtering for Stage 3
c6be992

Food Desert committed on

initial commit
4fdda86

FoodDesert committed on