Legal-i Claude Opus 4.7 (1M context) committed on
Commit 7e734b4 · 1 Parent(s): 4c9a9dd

feat(Phase 3): wire 5th dataset repo for general shard

TAU_RAG_EXTRA_SHARDS_REPOS now CSV of all 5 dataset repos:
legal-eye-shards-extra → administrative (~161MB)
legal-eye-shards-extra-2 → criminal (~944MB)
legal-eye-shards-extra-3 → constitutional (~453MB)
legal-eye-shards-extra-4 → procedure (~911MB)
legal-eye-shards-extra-5 → general (~641MB) ← NEW

After Space rebuild, all 15 shards live across 1 Space + 5 datasets.
Total corpus: 16,595 (Tier A curated) + 732,540 (Tier B sharded)
= 749,135 docs (100% of original PII-redacted parquet).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Files changed (1)
  1. Dockerfile +1 -1
Dockerfile CHANGED
@@ -54,7 +54,7 @@ ENV PYTHONPATH=/app \
  TAU_RAG_AUTOLOAD_CORPUS=1 \
  TAU_RAG_AUTOLOAD_CORPUS_PATH=/app/tau_rag/runtime/uploads/corpus_paid.jsonl \
  TAU_RAG_TIER=paid \
- TAU_RAG_EXTRA_SHARDS_REPOS=Legal-i/legal-eye-shards-extra,Legal-i/legal-eye-shards-extra-2,Legal-i/legal-eye-shards-extra-3,Legal-i/legal-eye-shards-extra-4 \
+ TAU_RAG_EXTRA_SHARDS_REPOS=Legal-i/legal-eye-shards-extra,Legal-i/legal-eye-shards-extra-2,Legal-i/legal-eye-shards-extra-3,Legal-i/legal-eye-shards-extra-4,Legal-i/legal-eye-shards-extra-5 \
  TAU_RAG_EXTRA_SHARDS_CACHE=/tmp/legal_eye_extra_shards \
  TAU_RAG_CLUSTER_AUGMENT=1 \
  TAU_RAG_AUTH_REQUIRED=true \
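
The commit does not show how the application reads TAU_RAG_EXTRA_SHARDS_REPOS, so the following is only a minimal sketch of how a comma-separated repo list like this one could be parsed and fetched at startup. The helper names parse_shard_repos and fetch_shards, and the per-repo cache directory layout, are hypothetical and not taken from the tau_rag code. The download step assumes the huggingface_hub package and its snapshot_download function; it is imported lazily so the parsing part runs without it.

```python
import os

# The value set in the Dockerfile by this commit (five dataset repos).
DEFAULT_REPOS = (
    "Legal-i/legal-eye-shards-extra,"
    "Legal-i/legal-eye-shards-extra-2,"
    "Legal-i/legal-eye-shards-extra-3,"
    "Legal-i/legal-eye-shards-extra-4,"
    "Legal-i/legal-eye-shards-extra-5"
)


def parse_shard_repos(raw: str) -> list[str]:
    """Split a comma-separated repo list, dropping blanks and duplicates.

    Hypothetical helper: the original order is preserved.
    """
    seen = set()
    repos = []
    for part in raw.split(","):
        repo = part.strip()
        if repo and repo not in seen:
            seen.add(repo)
            repos.append(repo)
    return repos


def fetch_shards(repos: list[str], cache_dir: str) -> list[str]:
    """Download each dataset repo into its own folder under cache_dir.

    Hypothetical helper: the folder naming is an assumption, not the
    layout used by the real application.
    """
    # Imported here so that parsing works even without huggingface_hub.
    from huggingface_hub import snapshot_download

    paths = []
    for repo in repos:
        local_dir = os.path.join(cache_dir, repo.replace("/", "__"))
        paths.append(
            snapshot_download(repo_id=repo, repo_type="dataset", local_dir=local_dir)
        )
    return paths


if __name__ == "__main__":
    raw = os.environ.get("TAU_RAG_EXTRA_SHARDS_REPOS", DEFAULT_REPOS)
    for repo in parse_shard_repos(raw):
        print(repo)
```

Dropping blank entries guards against a trailing comma in the ENV line, a likely slip when another repo is appended to the list.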