Spaces: FoodDesert/Prompt_Squirrel_RAG

Food Desert committed on May 7

Commit d57f0a5 (1 parent: 65b7582)

Add tooltip override system and curate top-1000 tooltip fixes

Files changed (5)

PROJECT_SUMMARY.md +27 -24
SESSION_QUICKSTART.md +5 -1
app.py +29 -1
data/tag_tooltip_overrides.csv +49 -0
docs/space_overview.md +11 -1

PROJECT_SUMMARY.md CHANGED

@@ -45,11 +45,11 @@ The system implements a **three-stage pipeline** with strict contracts between s
 - **Module**: `psq_rag.llm.select`
 - **Output**: Selected tags with optional rationale codes (explicit, strong_implied, weak_implied, style_or_meta, other)
-### Stage 3s: Structural Tag Inference (Optional)
-- **Purpose**: Infer structural/implied tags from selected tags
-- **Example**: clothing → topless/bottomless based on what clothing is present
-- **Implementation**: Group-based system using wiki data
-- **Module**: `psq_rag.llm.select.llm_infer_structural_tags`
 ---
@@ -85,7 +85,7 @@ Implementation of a categorized tag suggestion system based on the e621 tagging
 ### Tag Implication System (Feb 10-11, 2026)
 - Integrated `tag_implications-2023-07-20.csv`
-- Automatic tag expansion (e.g., fox → canine → canid → mammal)
 - Expanded ground truth annotations for evaluation
 - Leaf-only metrics to avoid penalizing implied tags
@@ -114,11 +114,12 @@ Implementation of a categorized tag suggestion system based on the e621 tagging
 ### Artifacts (loaded lazily)
 - **FastText embeddings**: Compressed format for semantic similarity
 - **TF-IDF vectors + SVD**: For context-based tag similarity
-- **Alias mappings**: Non-canonical → canonical tag projection
 - **Tag counts**: Frequency information from corpus
-- **Tag implications**: Hierarchical tag relationships (e.g., species → family)
-- **Tag groups** (`data/tag_groups.json`): Structured tag families for inference
-- **Tag wiki definitions** (`data/tag_wiki_defs.json`): E621 wiki data for tags
 ### Configuration
 - **tagging_checklist.txt**: E621 tagging guidelines and categories
@@ -181,9 +182,10 @@ Implementation of a categorized tag suggestion system based on the e621 tagging
 ### Sample Scripts
 - **scripts/rewrite_playground.py**: Stage 1 testing
-- **scripts/stage3_debug.py**: Stage 3 debugging
-- **scripts/test_categorized_suggestions.py**: Category suggestion testing
-- **scripts/test_parser_only.py**: Parser validation
 ---
@@ -196,7 +198,7 @@ Implementation of a categorized tag suggestion system based on the e621 tagging
 - Verbose retrieval reporting (optional)
 - NSFW tag filtering (configurable)
 - Final prompt composition with deduplication
-- Mascot branding (🐿️ squirrel)
 ### Configuration
 - `allow_nsfw_tags`: NSFW content filtering
@@ -211,7 +213,7 @@ Implementation of a categorized tag suggestion system based on the e621 tagging
 1. **Alias-to-Canonical Projection**: Handles non-canonical tag variants and projects them to e621 canonical forms
-2. **Head-Noun Expansion**: Automatically extracts head nouns from multi-word phrases (e.g., "big shirt" → also search "shirt")
 3. **Dual Scoring**: FastText semantic similarity + TF-IDF/SVD context similarity with weighted fusion
@@ -264,14 +266,15 @@ Core packages (requirements.txt):
 In this session (and recent sessions based on git history), we have:
-1. ✅ **Built tag categorization infrastructure** based on e621 checklist
-2. ✅ **Created category parser** with tier and constraint support
-3. ✅ **Implemented TF-IDF-based categorized suggestions**
-4. ✅ **Added comprehensive evaluation metrics** (per-category P/R/F1, ranking metrics)
-5. ✅ **Fixed multi-select constraint handling** for body_type, species, gender
-6. ✅ **Improved structural inference system** with group-based wiki data approach
-7. ✅ **Enhanced evaluation pipeline** with parallel processing and implication expansion
-8. ✅ **Added diagnostic and analysis tools** for debugging and quality assessment
-9. ✅ **Cleaned up binary files** and moved to proper XET storage on Hugging Face
 The project is now at a sophisticated stage with a full three-stage pipeline, comprehensive evaluation infrastructure, and category-based tag organization aligned with e621's tagging best practices.

 - **Module**: `psq_rag.llm.select`
 - **Output**: Selected tags with optional rationale codes (explicit, strong_implied, weak_implied, style_or_meta, other)
+### Stage 3s: Structural Tag Inference (Optional)
+- **Purpose**: Infer structural tags directly from prompt text using a fixed group definition set
+- **Example**: character count, body type, gender, clothing state, visual elements
+- **Implementation**: Group-based LLM inference with deterministic postprocessing (for example, trio implies group)
+- **Module**: `psq_rag.llm.select.llm_infer_structural_tags`
 ---
 ### Tag Implication System (Feb 10-11, 2026)
 - Integrated `tag_implications-2023-07-20.csv`
+- Automatic tag expansion (e.g., fox → canine → canid → mammal)
 - Expanded ground truth annotations for evaluation
 - Leaf-only metrics to avoid penalizing implied tags
 ### Artifacts (loaded lazily)
 - **FastText embeddings**: Compressed format for semantic similarity
 - **TF-IDF vectors + SVD**: For context-based tag similarity
+- **Alias mappings**: Non-canonical → canonical tag projection
 - **Tag counts**: Frequency information from corpus
+- **Tag implications**: Hierarchical tag relationships (e.g., species → family)
+- **Tag groups** (`data/tag_groups.json`): Structured tag families for inference
+- **Tag wiki definitions** (`data/tag_wiki_defs.json`): E621 wiki data for tags
+- **Tooltip overrides** (`data/tag_tooltip_overrides.csv`): Curated tooltip text fixes that override wiki definitions for specific tags
 ### Configuration
 - **tagging_checklist.txt**: E621 tagging guidelines and categories
 ### Sample Scripts
 - **scripts/rewrite_playground.py**: Stage 1 testing
+- **scripts/stage3_debug.py**: Stage 3 debugging
+- **scripts/test_categorized_suggestions.py**: Category suggestion testing
+- **scripts/test_parser_only.py**: Parser validation
+- **scripts/test_structural_trio_group_rule.py**: Regression test for `trio -> group` structural postprocess and eval sample invariant
 ---
 - Verbose retrieval reporting (optional)
 - NSFW tag filtering (configurable)
 - Final prompt composition with deduplication
+- Mascot branding (🐿️ squirrel)
 ### Configuration
 - `allow_nsfw_tags`: NSFW content filtering
 1. **Alias-to-Canonical Projection**: Handles non-canonical tag variants and projects them to e621 canonical forms
+2. **Head-Noun Expansion**: Automatically extracts head nouns from multi-word phrases (e.g., "big shirt" → also search "shirt")
 3. **Dual Scoring**: FastText semantic similarity + TF-IDF/SVD context similarity with weighted fusion
 In this session (and recent sessions based on git history), we have:
+1. ✅ **Built tag categorization infrastructure** based on e621 checklist
+2. ✅ **Created category parser** with tier and constraint support
+3. ✅ **Implemented TF-IDF-based categorized suggestions**
+4. ✅ **Added comprehensive evaluation metrics** (per-category P/R/F1, ranking metrics)
+5. ✅ **Fixed multi-select constraint handling** for body_type, species, gender
+6. ✅ **Improved structural inference system** with group-based wiki data approach
+7. ✅ **Enhanced evaluation pipeline** with parallel processing and implication expansion
+8. ✅ **Added diagnostic and analysis tools** for debugging and quality assessment
+9. ✅ **Cleaned up binary files** and moved to proper XET storage on Hugging Face
 The project is now at a sophisticated stage with a full three-stage pipeline, comprehensive evaluation infrastructure, and category-based tag organization aligned with e621's tagging best practices.
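
The implication expansion and leaf-only metrics mentioned in this summary can be illustrated with a minimal sketch. It is not the project's implementation: the in-memory `IMPLICATIONS` map and the names `expand_with_implications` and `leaf_tags` are hypothetical, and the sketch assumes a "leaf" is a tag that no other tag in the same set implies. The real data comes from `tag_implications-2023-07-20.csv`, whose column layout is not shown in this commit.

```python
from typing import Dict, Iterable, Set

# Hypothetical implication map (tag -> tags it directly implies),
# mirroring the fox -> canine -> canid -> mammal example from the summary.
IMPLICATIONS: Dict[str, Set[str]] = {
    "fox": {"canine"},
    "canine": {"canid"},
    "canid": {"mammal"},
}


def expand_with_implications(
    tags: Iterable[str], implications: Dict[str, Set[str]]
) -> Set[str]:
    """Return the input tags plus everything they transitively imply."""
    expanded = set(tags)
    stack = list(expanded)
    while stack:
        tag = stack.pop()
        for implied in implications.get(tag, ()):
            if implied not in expanded:
                expanded.add(implied)
                stack.append(implied)
    return expanded


def leaf_tags(tags: Iterable[str], implications: Dict[str, Set[str]]) -> Set[str]:
    """Keep only tags that are not implied by another tag in the set (assumed definition)."""
    tag_set = set(tags)
    implied_by_others: Set[str] = set()
    for tag in tag_set:
        implied_by_others |= expand_with_implications([tag], implications) - {tag}
    return tag_set - implied_by_others


print(sorted(expand_with_implications(["fox"], IMPLICATIONS)))
print(sorted(leaf_tags(["fox", "canine", "mammal"], IMPLICATIONS)))
```

Scoring only the leaves means a prediction of `fox` is not counted as missing `canine`, `canid`, and `mammal` when the ground truth has been implication-expanded.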

SESSION_QUICKSTART.md CHANGED

@@ -7,7 +7,7 @@ A RAG system that converts natural language prompts → e621-style tags for furr
 1. **Stage 1 (Rewrite)**: Natural language → tag-shaped phrases (LLM)
 2. **Stage 2 (Retrieval)**: Phrases → candidate tags (FastText + TF-IDF/SVD, closed vocab)
 3. **Stage 3 (Selection)**: Candidates → final selected tags (LLM)
-4. **Stage 3s (Structural)**: Selected tags → structural inferences (optional, e.g., clothing → topless)
 ## Latest Features (Feb 13-14, 2026)
 - **Tag Categorization**: Organized suggestions by e621 checklist categories (species, clothing, posture, etc.)
@@ -23,9 +23,11 @@ A RAG system that converts natural language prompts → e621-style tags for furr
 - `scripts/eval_categorized.py` - Per-category metrics
 - `scripts/analyze_threshold_grid.py` - Threshold grid analysis (score/global rank/phrase rank)
 - `scripts/analyze_caption_evident_audit.py` - Caption-evident audit vs retrieval
 - `docs/retrieval_contract.md` - Stage 2 spec
 - `docs/stage3_contract.md` - Stage 3 spec
 - `tagging_checklist.txt` - E621 tagging guidelines
 ## Running Code
 ```bash
@@ -74,12 +76,14 @@ ls -la data/eval_results/
 - **Caption-evident audit**: Run `scripts/analyze_caption_evident_audit.py`
 - **Test retrieval**: Use `scripts/smoke_test.py`
 - **Debug Stage 3**: Use `scripts/stage3_debug.py` (`--phrases` optional; omitted runs Stage 1 rewrite first, then Stage 2 retrieval from rewritten phrases)
 ## Data Artifacts (Lazy-loaded)
 - FastText embeddings (semantic similarity)
 - TF-IDF + SVD matrices (context similarity)
 - Alias → canonical tag mappings
 - Tag counts, implications, groups, wiki definitions
 ## Eval Datasets
 - `data/eval_samples/e621_sfw_sample_1000_seed123_buffer10000_expanded.jsonl` - Base eval set (implication-expanded GT)

 1. **Stage 1 (Rewrite)**: Natural language → tag-shaped phrases (LLM)
 2. **Stage 2 (Retrieval)**: Phrases → candidate tags (FastText + TF-IDF/SVD, closed vocab)
 3. **Stage 3 (Selection)**: Candidates → final selected tags (LLM)
+4. **Stage 3s (Structural)**: Prompt text → structural tags (LLM over fixed groups) plus deterministic postprocess rules (for example, `trio` implies `group`)
 ## Latest Features (Feb 13-14, 2026)
 - **Tag Categorization**: Organized suggestions by e621 checklist categories (species, clothing, posture, etc.)
 - `scripts/eval_categorized.py` - Per-category metrics
 - `scripts/analyze_threshold_grid.py` - Threshold grid analysis (score/global rank/phrase rank)
 - `scripts/analyze_caption_evident_audit.py` - Caption-evident audit vs retrieval
+- `scripts/test_structural_trio_group_rule.py` - Regression test for structural `trio -> group` mapping and eval sample invariant
 - `docs/retrieval_contract.md` - Stage 2 spec
 - `docs/stage3_contract.md` - Stage 3 spec
 - `tagging_checklist.txt` - E621 tagging guidelines
+- `data/tag_tooltip_overrides.csv` - Manual tooltip text overrides (`tag, tooltip_override`)
 ## Running Code
 ```bash
 - **Caption-evident audit**: Run `scripts/analyze_caption_evident_audit.py`
 - **Test retrieval**: Use `scripts/smoke_test.py`
 - **Debug Stage 3**: Use `scripts/stage3_debug.py` (`--phrases` optional; omitted runs Stage 1 rewrite first, then Stage 2 retrieval from rewritten phrases)
+- **Fix tooltip definitions**: Add/adjust rows in `data/tag_tooltip_overrides.csv` instead of editing extracted `data/tag_wiki_defs.json`
 ## Data Artifacts (Lazy-loaded)
 - FastText embeddings (semantic similarity)
 - TF-IDF + SVD matrices (context similarity)
 - Alias → canonical tag mappings
 - Tag counts, implications, groups, wiki definitions
+- Tooltip override CSV for curated definition fixes
 ## Eval Datasets
 - `data/eval_samples/e621_sfw_sample_1000_seed123_buffer10000_expanded.jsonl` - Base eval set (implication-expanded GT)
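
The deterministic postprocess step that Stage 3s now documents (`trio` implies `group`) could look roughly like the sketch below. Only the `trio -> group` rule is named in this commit; the rule table `STRUCTURAL_POSTPROCESS_RULES` and the function `apply_structural_postprocess` are hypothetical names, and the real logic lives somewhere in the project's structural inference code that is not part of this diff.

```python
from typing import Dict, Iterable, List

# Hypothetical rule table: tag -> tags that must accompany it.
# Only `trio` -> `group` is confirmed by the commit text.
STRUCTURAL_POSTPROCESS_RULES: Dict[str, List[str]] = {
    "trio": ["group"],
}


def apply_structural_postprocess(
    tags: Iterable[str],
    rules: Dict[str, List[str]] = STRUCTURAL_POSTPROCESS_RULES,
) -> List[str]:
    """Deduplicate tags in order, then append any tags required by the rules.

    This is a single pass: tags added by a rule do not trigger further rules.
    """
    out: List[str] = []
    seen = set()
    for tag in tags:
        if tag not in seen:
            seen.add(tag)
            out.append(tag)
    for tag in list(out):
        for implied in rules.get(tag, ()):
            if implied not in seen:
                seen.add(implied)
                out.append(implied)
    return out


print(apply_structural_postprocess(["trio", "male"]))
```

Because the rule is applied after the LLM call, the result does not depend on whether the model happened to emit `group` itself, which is the kind of invariant a regression test such as `scripts/test_structural_trio_group_rule.py` can check.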

app.py CHANGED

@@ -118,6 +118,32 @@ def _load_tag_wiki_defs() -> Dict[str, str]:
         return {}
 @lru_cache(maxsize=1)
 def _load_about_docs_markdown() -> str:
     candidates = [
@@ -156,7 +182,9 @@ def _tooltip_text_for_tag(tag: str) -> str:
         count = None
     if isinstance(count, int):
         parts.append(f"Count: {count:,}")
-    d = _load_tag_wiki_defs().get(t, "")
     if d:
         parts.append(d)
     return "\n".join(parts).strip()

         return {}
+@lru_cache(maxsize=1)
+def _load_tag_tooltip_overrides() -> Dict[str, str]:
+    """Load optional per-tag tooltip text overrides.
+    File format:
+      data/tag_tooltip_overrides.csv
+      columns: tag, tooltip_override
+    """
+    p = Path("data/tag_tooltip_overrides.csv")
+    if not p.exists():
+        return {}
+    try:
+        out: Dict[str, str] = {}
+        with p.open("r", encoding="utf-8", newline="") as f:
+            reader = csv.DictReader(f)
+            for row in reader:
+                tag = _norm_tag_for_lookup(str(row.get("tag", "")))
+                text = " ".join(str(row.get("tooltip_override", "")).split())
+                if tag and text:
+                    out[tag] = text
+        return out
+    except Exception:
+        return {}
 @lru_cache(maxsize=1)
 def _load_about_docs_markdown() -> str:
     candidates = [
         count = None
     if isinstance(count, int):
         parts.append(f"Count: {count:,}")
+    d = _load_tag_tooltip_overrides().get(t, "")
+    if not d:
+        d = _load_tag_wiki_defs().get(t, "")
     if d:
         parts.append(d)
     return "\n".join(parts).strip()
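
The lookup order this change introduces (curated override first, wiki definition as fallback) can be tried in isolation with the sketch below. It is a simplified stand-in, not the code from `app.py`: `parse_overrides` and `resolve_tooltip` are hypothetical names, the CSV is read from a string instead of `data/tag_tooltip_overrides.csv`, and `strip().lower()` is only an assumed substitute for `_norm_tag_for_lookup`, whose behavior is not shown in this diff. The sketch uses `or ""` when reading cells because `csv.DictReader` fills missing columns with `None`, which `str()` would turn into the literal text `None`.

```python
import csv
import io
from typing import Dict


def parse_overrides(csv_text: str) -> Dict[str, str]:
    """Parse `tag,tooltip_override` rows, skipping rows with an empty tag or text."""
    out: Dict[str, str] = {}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # `or ""` guards against None for rows that are missing a column.
        tag = (row.get("tag") or "").strip().lower()
        text = " ".join((row.get("tooltip_override") or "").split())
        if tag and text:
            out[tag] = text
    return out


def resolve_tooltip(tag: str, overrides: Dict[str, str], wiki_defs: Dict[str, str]) -> str:
    """Prefer the curated override; fall back to the extracted wiki definition."""
    key = tag.strip().lower()
    return overrides.get(key) or wiki_defs.get(key, "")


SAMPLE_CSV = (
    "tag,tooltip_override\n"
    'trio,"Exactly three characters are visible in the image."\n'
    "plant\n"  # short row: no override text, so it is skipped
)
overrides = parse_overrides(SAMPLE_CSV)
wiki_defs = {"trio": "noisy extracted text", "plant": "Extracted wiki text for plant."}

print(resolve_tooltip("trio", overrides, wiki_defs))
print(resolve_tooltip("plant", overrides, wiki_defs))
```

Keeping the overrides in a separate file means the extracted wiki definitions can be regenerated without losing manual fixes.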

data/tag_tooltip_overrides.csv ADDED

@@ -0,0 +1,49 @@
+tag,tooltip_override
+trio,"Exactly three characters are visible in the image. In e621 practice, trio is often co-tagged with group."
+nintendo,"Nintendo is a Japanese video game company and publisher."
+pokemon,"Pokemon is a media franchise centered on collecting, training, and battling Pokemon species."
+vaginal,"Used for posts depicting vaginal penetration or other explicitly vaginal sexual focus."
+piercing,"Posts depicting visible body piercings (ear, facial, nipple, genital, etc.)."
+animal_penis,"Tag for penises with clearly animal (non-humanoid) anatomy."
+plant,"Tag for plants or flora (realistic or stylized) depicted in the image."
+furniture,"Movable household or interior objects such as chairs, tables, beds, sofas, and cabinets."
+ear_piercing,"A visible piercing on the ear (including lobe or cartilage)."
+solo_focus,"In a multi-character image, one character is clearly emphasized as the main subject."
+bird,"A member of class Aves (avian species), including realistic or anthro birds."
+by_conditional_dnp,"Used for posts by artists with conditional do-not-post (DNP) restrictions."
+bedroom_eyes,"Half-lidded or narrowed eyes used in a seductive or flirtatious context."
+pink_nipples,"Nipples that are visibly pink."
+by_avoid_posting,"Artist/admin tag for creators on the avoid-posting list (do-not-post status)."
+grey_background,"An image with a predominantly grey background."
+pokemorph,"A character depicted as a human-like version of a Pokemon species."
+after_sex,"Depicts a moment immediately after sexual activity."
+by_unknown_artist,"Artist tag used when the original artist is unknown."
+generation_6_pokemon,"Pokemon species introduced in Generation 6 (X/Y)."
+crossgender,"A character depicted as a different gender than their usual depiction."
+<3_eyes,"Eyes that are heart-shaped or contain heart symbols."
+animal_crossing,"The Animal Crossing video game franchise."
+cum_while_penetrated,"Ejaculation while the character is being penetrated."
+animal_pussy,"Genitals with clearly animal (non-humanoid) vulva/pussy anatomy."
+generation_7_pokemon,"Pokemon species introduced in Generation 7 (Sun/Moon)."
+pokephilia,"Sexual or romantic activity involving a Pokemon and a non-Pokemon character."
+breast_squish,"Breasts visibly pressed or squished against a surface, object, or character."
+bow_ribbon,"A bow made from ribbon."
+featureless_crotch,"Crotch area shown without visible genital detail where it would normally be expected."
+blue_background,"An image with a predominantly blue background."
+fluttershy_(mlp),"Fluttershy, a main character from My Little Pony: Friendship is Magic."
+by_third-party_edit,"Artist/admin tag used when a post was edited by someone other than the original artist."
+rainbow_dash_(mlp),"Rainbow Dash, a main character from My Little Pony: Friendship is Magic."
+saliva_string,"A visible strand of saliva stretching between a mouth/tongue and another surface."
+male/ambiguous,"Sexual or romantic activity between a male character and an ambiguous-gender character."
+hair_bow,"A bow accessory worn on or in the hair."
+bow_tie,"A neck accessory tied in a bow shape."
+rarity_(mlp),"Rarity, a main character from My Little Pony: Friendship is Magic."
+bow_accessory,"An accessory that features a bow."
+patreon,"Patreon, a membership and crowdfunding platform."
+pinkie_pie_(mlp),"Pinkie Pie, a main character from My Little Pony: Friendship is Magic."
+warcraft,"The Warcraft video game franchise by Blizzard Entertainment."
+generation_8_pokemon,"Pokemon species introduced in Generation 8 (Sword/Shield)."
+sexual_barrier_device,"A barrier contraceptive used during sexual activity (for example condoms or dental dams)."
+applejack_(mlp),"Applejack, a main character from My Little Pony: Friendship is Magic."
+princess_celestia_(mlp),"Princess Celestia, an alicorn ruler character from My Little Pony."
+princess_luna_(mlp),"Princess Luna, an alicorn ruler character from My Little Pony."

docs/space_overview.md CHANGED

@@ -18,7 +18,7 @@ Design goals:
 - `Rewrite`:
   Turns the user prompt into short, tag-like pseudo-phrases that are easier to match in vector retrieval. These phrases are optimized as search queries for candidate lookup.
 - `Structural Inference`:
-  Runs an LLM call over a fixed set of high-level structure tags (for example character count, body type, gender, clothing state, gaze/text). It outputs only the structural tags it believes are supported.
 - `Probe Inference`:
   Runs a separate LLM call over a small, curated set of informative tags. This is a targeted check for tags that are often useful for reranking and final selection.
 - `Retrieval Candidates`:
@@ -67,6 +67,16 @@ The current Space does not claim to solve these other domains directly. Porting
 - Tag implications graph
 - Group/category mappings for row display
 - Optional wiki definitions (used for hover help)
 ## Technologies Used

 - `Rewrite`:
   Turns the user prompt into short, tag-like pseudo-phrases that are easier to match in vector retrieval. These phrases are optimized as search queries for candidate lookup.
 - `Structural Inference`:
+  Runs an LLM call over a fixed set of high-level structure tags (for example character count, body type, gender, clothing state, gaze/text). It outputs only the structural tags it believes are supported, then applies deterministic postprocessing rules (for example `trio` implies `group`).
 - `Probe Inference`:
   Runs a separate LLM call over a small, curated set of informative tags. This is a targeted check for tags that are often useful for reranking and final selection.
 - `Retrieval Candidates`:
 - Tag implications graph
 - Group/category mappings for row display
 - Optional wiki definitions (used for hover help)
+- Optional tooltip override table (`data/tag_tooltip_overrides.csv`) for manual corrections when wiki extraction is noisy
+## Tooltip Text Resolution
+Tooltip text for tags resolves in this order:
+1. `data/tag_tooltip_overrides.csv` (`tag, tooltip_override`) when a tag has a curated override
+2. `data/tag_wiki_defs.json` as fallback
+This keeps the extracted wiki file immutable while allowing targeted manual fixes for frequent/high-impact tags.
 ## Technologies Used