# AGENT_GUIDE.md — AI Agent Self-Execution Protocol

> **Target audience:** AI coding agents (Claude, GPT, Codex, Hermes, etc.)
> **Goal:** Replicate Weight-Diff SVD LoRA extraction on ANY model pair
> **Language:** English (deterministic, no ambiguity)
> **Human-readable summary:** see METHOD.md / METHOD_EN.md / METHOD_ZH.md

---

## 1. MISSION STATEMENT

Given two models `MA` and `MB` fine-tuned from the SAME base, extract `Δ = MB − MA` and compress it into a PEFT-compatible LoRA adapter via truncated SVD — with zero training, zero GPU, in <5 minutes on CPU.
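
A sketch of the underlying math, using notation introduced here for illustration ($r$ for the rank, $\sigma_i$ for the singular values of $\Delta$, and $W_A$, $W_B$ for one matching weight matrix in `MA` and `MB`):

$$
\begin{aligned}
\Delta &= W_B - W_A = U \Sigma V^\top, \\
\Delta_r &= U_r \Sigma_r V_r^\top, \qquad B = U_r \Sigma_r^{1/2}, \quad A = \Sigma_r^{1/2} V_r^\top, \quad BA = \Delta_r, \\
\frac{\lVert \Delta - \Delta_r \rVert_F^2}{\lVert \Delta \rVert_F^2} &= \frac{\sum_{i>r} \sigma_i^2}{\sum_i \sigma_i^2}.
\end{aligned}
$$

Here $U_r$, $\Sigma_r$, $V_r$ keep only the top $r$ singular triplets. By the Eckart-Young theorem, $\Delta_r$ is the best rank-$r$ approximation in Frobenius norm, so the last ratio is the smallest relative squared error any rank-$r$ adapter can reach for that matrix.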

---

## 2. PRECONDITIONS (fail-fast checks)

```
CHECK 1: Both models on HuggingFace?          YES → continue | NO → ABORT
CHECK 2: Same base model? (same config.json)  YES → continue | NO → ABORT
CHECK 3: Both LoRA-trained + merged?          YES → continue | NO → WARN (may still work)
CHECK 4: Available disk space > 3× model?     YES → continue | NO → suggest cloud
CHECK 5: Python 3.10+ with torch, safetensors, peft? YES → continue | NO → pip install
```

**Check 2 implementation:**
```bash
# Download only config.json from both models (no weights)
# -f fails on HTTP errors (e.g. a missing or private repo), -L follows redirects
curl -sfL "https://huggingface.co/$MA/resolve/main/config.json" > /tmp/cfg_a.json
curl -sfL "https://huggingface.co/$MB/resolve/main/config.json" > /tmp/cfg_b.json
diff <(python3 -c "import json; d=json.load(open('/tmp/cfg_a.json')); d.pop('_name_or_path',None); print(json.dumps(d,sort_keys=True))") \
     <(python3 -c "import json; d=json.load(open('/tmp/cfg_b.json')); d.pop('_name_or_path',None); print(json.dumps(d,sort_keys=True))")
# Exit code 0 = identical architecture → PROCEED
# Exit code 1 = different → ABORT with explanation
```
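
When the check fails, an agent usually needs to know which fields differ in order to write the ABORT explanation. The following is a minimal sketch of the same comparison in Python; `config_diff` and the toy configs are hypothetical names and values used only for illustration, and real input would come from `json.load` on the two downloaded files.

```python
def config_diff(cfg_a: dict, cfg_b: dict, ignore=("_name_or_path",)) -> dict:
    """Return {key: (value_a, value_b)} for every key whose values differ."""
    missing = "<missing>"  # marks a key that exists in only one config
    keys = (set(cfg_a) | set(cfg_b)) - set(ignore)
    diff = {}
    for key in sorted(keys):
        a, b = cfg_a.get(key, missing), cfg_b.get(key, missing)
        if a != b:
            diff[key] = (a, b)
    return diff


# Toy configs standing in for /tmp/cfg_a.json and /tmp/cfg_b.json
cfg_a = {"_name_or_path": "org/model-a", "hidden_size": 4096, "num_hidden_layers": 32}
cfg_b = {"_name_or_path": "org/model-b", "hidden_size": 4096, "num_hidden_layers": 36}

mismatches = config_diff(cfg_a, cfg_b)
print(mismatches if mismatches else "configs match: PROCEED")
```

An empty result corresponds to exit code 0 of the `diff` command above; any entry in the result is a concrete reason to include in the ABORT message.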

---

## 3. PARAMETER DECISION TREE

```
Input: MA, MB (HuggingFace repo IDs)
Output: LoRA adapter at OUTPUT_DIR

Step 0: Determine ARCHITECTURE from config.json:
         - "Qwen3MoeForCausalLM"       → Qwen MoE (256 experts, every 4th layer)
         - "LlamaForCausalLM"          → Llama (dense, all layers)
         - "MistralForCausalLM"        → Mistral (dense, all layers)
         - "GemmaForCausalLM"          → Gemma (dense, all layers)
         - OTHER                       → prompt user for target_modules

Step 1: Determine TARGET_MODULES:
         IF MoE architecture:
           TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj"]  # attention only, skip 3D experts
           SKIP_3D = True
         ELSE (dense):
           TARGET_MODULES = ["q_proj", "k_proj", "v_proj", "o_proj", 
                             "gate_proj", "up_proj", "down_proj"]
           SKIP_3D = False

Step 2: Determine RANK:
         DEFAULT: RANK = 16
         IF model < 7B params:  RANK = 32  (smaller model = more budget for rank)
         IF model > 70B params: RANK = 8   (larger model = delta may be sparser)
         User can override via --rank flag

Step 3: Determine ALPHA:
         ALPHA = RANK * 2  (common LoRA convention; PEFT scales the update by ALPHA / RANK)
```
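
The decision tree above can be sketched as a single function. This is an illustrative helper, not part of the extraction script: the name `choose_parameters`, the `n_params_billion` argument, and the `rank_override` argument (standing in for the `--rank` flag) are assumptions introduced here.

```python
ATTENTION = ["q_proj", "k_proj", "v_proj", "o_proj"]
MLP = ["gate_proj", "up_proj", "down_proj"]
DENSE_ARCHS = {"LlamaForCausalLM", "MistralForCausalLM", "GemmaForCausalLM"}
MOE_ARCHS = {"Qwen3MoeForCausalLM"}


def choose_parameters(architecture: str, n_params_billion: float, rank_override=None) -> dict:
    """Hypothetical helper mirroring Steps 0-3 of the decision tree."""
    # Steps 0-1: architecture decides the target modules
    if architecture in MOE_ARCHS:
        target_modules, skip_3d = list(ATTENTION), True
    elif architecture in DENSE_ARCHS:
        target_modules, skip_3d = ATTENTION + MLP, False
    else:
        raise ValueError(f"Unknown architecture {architecture!r}: ask the user for target_modules")

    # Step 2: rank from model size unless the user overrides it
    if rank_override is not None:
        rank = rank_override
    elif n_params_billion < 7:
        rank = 32
    elif n_params_billion > 70:
        rank = 8
    else:
        rank = 16

    # Step 3: alpha follows the rank
    return {"target_modules": target_modules, "skip_3d": skip_3d,
            "rank": rank, "alpha": rank * 2}


print(choose_parameters("LlamaForCausalLM", 8))
```

Raising on an unknown architecture keeps the "prompt user" branch explicit instead of silently falling back to a default module list.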

---

## 4. EXTRACTION SCRIPT (inline, self-contained)

Save the script below as `extract_lora_diff.py` and run it with `python3 extract_lora_diff.py --model_a MA --model_b MB`. The script handles sharded checkpoints, tensors missing from one model, and 3D expert tensors automatically.

```python
#!/usr/bin/env python3
"""
Weight-Diff SVD LoRA Extraction — AI Agent Protocol
Usage: python3 extract_lora_diff.py --model_a MA --model_b MB [--rank 16] [--output ./lora_adapter]
"""
import os, sys, json, argparse, glob, gc, time
import torch
from safetensors.torch import load_file, save_file
from collections import OrderedDict

# ── CLI ──────────────────────────────────────────────────
parser = argparse.ArgumentParser()
parser.add_argument("--model_a", required=True, help="Source model HF repo ID")
parser.add_argument("--model_b", required=True, help="Target model HF repo ID")
parser.add_argument("--output", default="./lora_adapter", help="Output directory")
parser.add_argument("--rank", type=int, default=16, help="LoRA rank")
parser.add_argument("--alpha", type=int, default=None, help="LoRA alpha (default: 2*rank)")
parser.add_argument("--target_modules", default="q_proj,k_proj,v_proj,o_proj")
parser.add_argument("--cache_dir", default="./model_cache", help="Download cache")
parser.add_argument("--skip_3d", action=argparse.BooleanOptionalAction, default=True,
                    help="Skip 3D (stacked expert) tensors; disable with --no-skip_3d")
parser.add_argument("--tensor_filter", default=None, help="Regex filter for tensor names")
parser.add_argument("--keep_models", action="store_true", help="Keep downloaded models")
args = parser.parse_args()

if args.alpha is None:
    args.alpha = args.rank * 2

target_modules = [m.strip() for m in args.target_modules.split(",")]
OUTPUT_DIR = args.output
os.makedirs(OUTPUT_DIR, exist_ok=True)

# ── DOWNLOAD ─────────────────────────────────────────────
print(f"[1/4] Downloading models...")
# Keep a value the caller already exported (e.g. HF_HUB_ENABLE_HF_TRANSFER=0)
os.environ.setdefault("HF_HUB_ENABLE_HF_TRANSFER", "1")

from huggingface_hub import snapshot_download

path_a = snapshot_download(args.model_a, cache_dir=args.cache_dir, 
                           local_dir=f"{args.cache_dir}/model_a",
                           ignore_patterns=["*.gguf", "*.bin", "*.pt"])
path_b = snapshot_download(args.model_b, cache_dir=args.cache_dir,
                           local_dir=f"{args.cache_dir}/model_b",
                           ignore_patterns=["*.gguf", "*.bin", "*.pt"])

print(f"  Model A: {path_a}")
print(f"  Model B: {path_b}")

# ── FIND SAFETENSORS ─────────────────────────────────────
def find_safetensors(path, model_name):
    """Find all safetensors files, sorted by index."""
    files = sorted(glob.glob(f"{path}/*.safetensors"))
    if not files:
        print(f"ERROR: No safetensors found in {path}")
        sys.exit(1)
    # Sort by shard index
    indexed = []
    for f in files:
        basename = os.path.basename(f)
        if "model-" in basename:
            try:
                idx = int(basename.split("model-")[1].split("-")[0].split(".")[0])
                indexed.append((idx, f))
            except ValueError:
                indexed.append((9999, f))
        else:
            indexed.append((0, f))
    indexed.sort()
    print(f"  {model_name}: {len(indexed)} safetensors files")
    return [f for _, f in indexed]

files_a = find_safetensors(path_a, "Model A")
files_b = find_safetensors(path_b, "Model B")

# ── DISCOVER TENSORS ─────────────────────────────────────
print(f"\n[2/4] Discovering matching tensors...")

import re

def read_header(path):
    """Return the JSON header of a safetensors file without loading any weights."""
    with open(path, "rb") as fh:
        # Layout: 8-byte little-endian header length, then that many bytes of JSON
        header_len = int.from_bytes(fh.read(8), "little")
        header = json.loads(fh.read(header_len))
    header.pop("__metadata__", None)
    return header

# Collect tensor names and shapes from every shard header (no tensor data is read)
shapes_a = {}
for f in files_a:
    shapes_a.update({k: v["shape"] for k, v in read_header(f).items()})

shapes_b = {}
for f in files_b:
    shapes_b.update({k: v["shape"] for k, v in read_header(f).items()})

all_names_a = set(shapes_a)
all_names_b = set(shapes_b)

# Match tensors
common = all_names_a & all_names_b
print(f"  Tensors in A: {len(all_names_a)}")
print(f"  Tensors in B: {len(all_names_b)}")
print(f"  Common tensors: {len(common)}")

# Filter to target modules
tensors_to_process = []
for name in sorted(common):
    if any(f".{m}." in name or name.endswith(f".{m}.weight") for m in target_modules):
        # Skip 3D tensors (e.g. stacked MoE experts), using the shape from the header
        shape_a = shapes_a[name]
        if len(shape_a) >= 3 and args.skip_3d:
            print(f"  SKIP (3D): {name} shape={shape_a}")
            continue
        # Apply tensor filter if specified
        if args.tensor_filter and not re.search(args.tensor_filter, name):
            continue
        tensors_to_process.append(name)

print(f"  Target tensors to extract: {len(tensors_to_process)}")

if len(tensors_to_process) == 0:
    print("ERROR: No matching tensors found! Check target_modules and tensor_filter.")
    sys.exit(1)

# ── BUILD TENSOR INDEX ───────────────────────────────────
print(f"\n[3/4] Building tensor index...")

def build_index(files):
    """Map tensor_name -> path of the shard file that stores it."""
    idx = {}
    for fpath in files:
        with open(fpath, 'rb') as fh:
            # safetensors layout: 8-byte little-endian header length, then JSON
            header_len = int.from_bytes(fh.read(8), "little")
            header = json.loads(fh.read(header_len))
        for k in header:
            if k != '__metadata__':
                idx[k] = fpath
    return idx

idx_a = build_index(files_a)
idx_b = build_index(files_b)

# ── EXTRACT PER TENSOR ───────────────────────────────────
print(f"\n[4/4] Extracting LoRA via SVD (rank={args.rank})...")
start_time = time.time()

lora_weights = OrderedDict()
stats = []

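from safetensors import safe_open  # reads single tensors without loading a whole shard

for i, tname in enumerate(tensors_to_process):
    # Both models must contain the tensor
    if tname not in idx_a:
        print(f"  [{i+1}/{len(tensors_to_process)}] SKIP {tname} (not in A)")
        continue
    if tname not in idx_b:
        print(f"  [{i+1}/{len(tensors_to_process)}] SKIP {tname} (not in B)")
        continue
    
    # Read only this tensor; load_file() would materialize the entire shard each time
    with safe_open(idx_a[tname], framework="pt", device="cpu") as fa:
        w_a = fa.get_tensor(tname)
    with safe_open(idx_b[tname], framework="pt", device="cpu") as fb:
        w_b = fb.get_tensor(tname)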
    
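    # Ensure a matching 2D pair (skips 3D expert stacks and 1D biases such as q_proj.bias)
    if w_a.dim() != 2 or w_a.shape != w_b.shape:
        print(f"  [{i+1}/{len(tensors_to_process)}] SKIP {tname} "
              f"shape_a={list(w_a.shape)} shape_b={list(w_b.shape)} (not a matching 2D pair)")
        continue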
    
    # Compute delta
    delta = w_b.float() - w_a.float()  # upcast first: subtracting in BF16 rounds the small delta
    frob_norm = torch.norm(delta).item()
    
    # Use effective rank (min of requested rank and tensor dimensions)
    effective_rank = min(args.rank, delta.shape[0], delta.shape[1])
    
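    # Truncated SVD. torch.linalg.svd returns Vh (already transposed), whereas the
    # deprecated torch.svd returns V, which would make the row slice below wrong.
    try:
        U, S, Vt = torch.linalg.svd(delta, full_matrices=False)
    except Exception as e:
        print(f"  [{i+1}/{len(tensors_to_process)}] SVD FAILED {tname}: {e}")
        continue
    
    U_r = U[:, :effective_rank]
    S_r = S[:effective_rank]
    Vt_r = Vt[:effective_rank, :]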
    
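    # PEFT applies (lora_alpha / r) * B @ A at load time, so fold the inverse of that
    # scale into the factors; the square root splits each singular value symmetrically.
    scale = args.alpha / args.rank
    sqrt_S = torch.sqrt(S_r / scale)
    
    lora_A = (torch.diag(sqrt_S) @ Vt_r).contiguous()
    lora_B = (U_r @ torch.diag(sqrt_S)).contiguous()
    
    # Compute reconstruction quality of the delta as PEFT will apply it
    delta_recon = scale * (lora_B @ lora_A)
    recon_error = torch.norm(delta - delta_recon).item() / (frob_norm + 1e-10)
    # Energy is the retained share of the squared Frobenius norm: 1 - relative_error^2
    energy_retained = 1.0 - recon_error ** 2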
    
    # Save with PEFT's on-disk naming: the adapter name ("default") is not part of saved keys
    base_name = tname.removesuffix(".weight")
    lora_weights[f"base_model.model.{base_name}.lora_A.weight"] = lora_A
    lora_weights[f"base_model.model.{base_name}.lora_B.weight"] = lora_B
    
    stats.append({
        "tensor": tname,
        "shape": list(delta.shape),
        "frob_norm": round(frob_norm, 6),
        "rank_used": effective_rank,
        "energy_retained": round(energy_retained * 100, 1)
    })
    
    elapsed = time.time() - start_time
    print(f"  [{i+1}/{len(tensors_to_process)}] {tname} "
          f"|Δ|={frob_norm:.4f} r={effective_rank} energy={energy_retained*100:.1f}% "
          f"({elapsed:.0f}s)")

# ── SAVE ──────────────────────────────────────────────────
total_time = time.time() - start_time

# Save safetensors
save_file(lora_weights, os.path.join(OUTPUT_DIR, "adapter_model.safetensors"))

# Save config
total_params = sum(w.numel() for w in lora_weights.values())
config = {
    # The adapter encodes MB - MA, so it must be applied on top of model A
    "base_model_name_or_path": args.model_a,
    "peft_type": "LORA",
    "r": args.rank,
    "lora_alpha": args.alpha,
    "target_modules": target_modules,
    "lora_dropout": 0.0,
    "bias": "none",
    "task_type": "CAUSAL_LM",
    "inference_mode": True
}
with open(os.path.join(OUTPUT_DIR, "adapter_config.json"), "w") as f:
    json.dump(config, f, indent=2)

# Save stats
with open(os.path.join(OUTPUT_DIR, "extraction_stats.json"), "w") as f:
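    json.dump({
        "model_a": args.model_a,
        "model_b": args.model_b,
        "rank": args.rank,
        "alpha": args.alpha,
        "tensors_processed": len(stats),
        "total_params": total_params,
        # Use the real element size: the factors are stored in the dtype they were computed in
        "adapter_size_mb": round(sum(w.numel() * w.element_size() for w in lora_weights.values()) / 1024 / 1024, 2),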
        "extraction_time_seconds": round(total_time, 1),
        "tensor_stats": stats
    }, f, indent=2)

# ── SUMMARY ────────────────────────────────────────────────
print(f"\n{'='*60}")
print(f"EXTRACTION COMPLETE")
print(f"{'='*60}")
print(f"  Output:       {OUTPUT_DIR}")
print(f"  Tensors:      {len(stats)} extracted")
print(f"  Parameters:   {total_params:,}")
print(f"  Adapter size: {round(sum(w.numel() * w.element_size() for w in lora_weights.values()) / 1024 / 1024, 2)} MB")
print(f"  Total time:   {round(total_time, 1)} seconds")
if stats:
    energies = [s["energy_retained"] for s in stats]
    print(f"  Avg energy:   {sum(energies)/len(energies):.1f}%")
    print(f"  Min energy:   {min(energies):.1f}%")

# ── CLEANUP ────────────────────────────────────────────────
if not args.keep_models:
    import shutil
    for d in [f"{args.cache_dir}/model_a", f"{args.cache_dir}/model_b"]:
        if os.path.exists(d):
            shutil.rmtree(d, ignore_errors=True)
    print(f"  Cleaned up model cache")

print(f"{'='*60}")
```
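
Tensor discovery relies on the safetensors file layout: an 8-byte little-endian unsigned integer giving the header length, followed by that many bytes of JSON mapping each tensor name to its dtype, shape, and data offsets. The sketch below builds a tiny stand-in file by hand and reads its header back using only the standard library; the helper name `read_safetensors_header` and the tensor name `demo.weight` are hypothetical.

```python
import json
import os
import struct
import tempfile


def read_safetensors_header(path: str) -> dict:
    """Return {tensor_name: {"dtype", "shape", "data_offsets"}} without reading weights."""
    with open(path, "rb") as fh:
        (header_len,) = struct.unpack("<Q", fh.read(8))  # unsigned 64-bit, little-endian
        header = json.loads(fh.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


# Hand-built stand-in file: one 2x2 F32 tensor (16 bytes of payload)
header = {"demo.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]},
          "__metadata__": {"format": "pt"}}
header_bytes = json.dumps(header).encode("utf-8")
payload = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.safetensors")
    with open(path, "wb") as fh:
        fh.write(struct.pack("<Q", len(header_bytes)) + header_bytes + payload)
    parsed = read_safetensors_header(path)

print(parsed)
```

Because only the header is read, listing names and shapes across all shards costs a few kilobytes of I/O per file, regardless of model size.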

---

## 5. VERIFICATION PROTOCOL

After extraction completes, run these checks:

```bash
# CHECK 1: Files exist
ls -la $OUTPUT_DIR/adapter_model.safetensors $OUTPUT_DIR/adapter_config.json

# CHECK 2: Adapter loads in PEFT
python3 -c "
from peft import PeftConfig
config = PeftConfig.from_pretrained('$OUTPUT_DIR')
print(f'Rank: {config.r}, Alpha: {config.lora_alpha}')
print(f'Target modules: {config.target_modules}')
print('✓ PEFT config valid')
"

# CHECK 3: Tensor count matches expectations
python3 -c "
from safetensors.torch import load_file
w = load_file('$OUTPUT_DIR/adapter_model.safetensors')
print(f'Total tensors: {len(w)}')
print(f'Total params: {sum(t.numel() for t in w.values()):,}')
# Expect: 2 * num_target_tensors tensors (A and B for each)
# Example: 44 tensors → 88 lora tensors
"

# CHECK 4: Inspect shape and memory layout of the first lora_A/lora_B tensors
python3 -c "
from safetensors.torch import load_file
w = load_file('$OUTPUT_DIR/adapter_model.safetensors')
# Sorted keys place each lora_A next to its lora_B
for key in sorted(w)[:2]:
    assert w[key].is_contiguous(), key
    print(f'{key}: shape={list(w[key].shape)}')
print('✓ Sampled tensors are contiguous')
"
```
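
CHECK 3 expects one `lora_A` and one `lora_B` per target tensor. A slightly stricter check, sketched below under the assumption that key names contain `.lora_A.` and `.lora_B.`, also confirms that each pair agrees on the rank. The function name `check_lora_pairs` and the example shapes are hypothetical; with the `w` dict from CHECK 3, the input could be built as `{k: list(v.shape) for k, v in w.items()}`.

```python
def check_lora_pairs(shapes: dict) -> list:
    """shapes maps tensor name -> [rows, cols]; returns a list of problems (empty = OK)."""
    problems = []
    a_keys = [k for k in shapes if ".lora_A." in k]
    b_keys = {k for k in shapes if ".lora_B." in k}
    for a_key in a_keys:
        b_key = a_key.replace(".lora_A.", ".lora_B.")
        if b_key not in b_keys:
            problems.append(f"{a_key}: no matching lora_B")
            continue
        rank_a = shapes[a_key][0]  # lora_A is (rank, in_features)
        rank_b = shapes[b_key][1]  # lora_B is (out_features, rank)
        if rank_a != rank_b:
            problems.append(f"{a_key}: rank mismatch {rank_a} vs {rank_b}")
    if len(a_keys) != len(b_keys):
        problems.append(f"{len(a_keys)} lora_A tensors but {len(b_keys)} lora_B tensors")
    return problems


# Hypothetical shapes for one attention projection
good = {"layer0.q_proj.lora_A.weight": [16, 4096], "layer0.q_proj.lora_B.weight": [4096, 16]}
bad = {"layer0.q_proj.lora_A.weight": [16, 4096], "layer0.q_proj.lora_B.weight": [4096, 8]}

print(check_lora_pairs(good))
print(check_lora_pairs(bad))
```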

---

## 6. GGUF CONVERSION (optional)

If user wants llama.cpp compatibility:

```bash
# Step 1: Rename PEFT tensors to GGUF naming
python3 -c "
from safetensors.torch import load_file, save_file
w = load_file('$OUTPUT_DIR/adapter_model.safetensors')
renamed = {}
for k, v in w.items():
    new_k = k.replace('.default', '.weight')
    renamed[new_k] = v
save_file(renamed, '$OUTPUT_DIR/adapter_model_gguf.safetensors')
print(f'Renamed {len(renamed)} tensors for GGUF')
"

# Step 2: Convert to GGUF (requires llama.cpp cloned)
python3 llama.cpp/convert_lora_to_gguf.py $OUTPUT_DIR

# Step 3: Verify
ls -lh $OUTPUT_DIR/*.gguf
```

---

## 7. ERROR RECOVERY

| Symptom | Recovery Action |
|---------|-----------------|
| `OutOfMemoryError` during download | Delete `$CACHE_DIR` and retry with `--keep_models` |
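| `SVD did not converge` | Check the delta for NaN/Inf values and skip the tensor; lowering `--rank` does not help because truncation happens after the full SVD |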
| `Non-contiguous tensor` | Already handled: `.contiguous()` is in the script |
| `Key not found in safetensors` | Tensor exists in one model but not the other — skipped automatically |
| `3D tensor encountered` | Skipped automatically when `--skip_3d` is True |
| `SVD FAILED: linalg error` | Tensor is degenerate (all zeros or NaN) — skip and continue |
| `Download hangs` | Set `HF_HUB_ENABLE_HF_TRANSFER=0` to use Python fallback |

---

## 8. POST-EXTRACTION WORKFLOW

```
EXTRACTION DONE
    │
    ├─→ User wants PEFT: DONE (output dir is PEFT-ready)
    │
    ├─→ User wants GGUF: Run Section 6
    │
    ├─→ User wants figures: Generate with matplotlib
    │       fig1: bar chart of |Δ| per tensor
    │       fig2: rank vs reconstruction error
    │       fig3: pipeline diagram
    │       fig4: heatmap / layer analysis
    │
    ├─→ User wants paper: Fill Section 9 template
    │
    └─→ User wants to publish: Upload to HuggingFace
            hf upload USERNAME/REPO_NAME $OUTPUT_DIR .
```

---

## 9. PAPER GENERATION TEMPLATE

If user requests a research paper, load stats from `extraction_stats.json` and populate:

```
Title: "Weight-Diff SVD Extraction: Zero-Shot LoRA Adapter Synthesis from [MA] to [MB]"
Authors: [agent_name] (Hermes Agent, Nous Research) & [user_name]
Abstract: Extract rank-[RANK] LoRA adapter from [MODEL_A]→[MODEL_B] delta.
          Compression: [FULL_SIZE] → [ADAPTER_SIZE] ([RATIO]:1). Time: [EXTRACTION_TIME]s.
Method: W_B − W_A = Δ → Truncated SVD → LoRA A/B decomposition
Results: [N] tensors processed, [ENERGY]% average energy retained
```
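
Populating the template can be sketched with plain string formatting. The field names read from `stats` follow the keys the extraction script writes to `extraction_stats.json`; `fill_template`, the `full_size_mb` argument (the full model size is not stored in the stats file), and all numbers in the example are placeholders chosen for illustration, not measured results.

```python
TEMPLATE = """\
Title: "Weight-Diff SVD Extraction: Zero-Shot LoRA Adapter Synthesis from {model_a} to {model_b}"
Abstract: Extract rank-{rank} LoRA adapter from {model_a}→{model_b} delta.
          Compression: {full_mb:.0f} MB → {adapter_mb:.2f} MB ({ratio:.0f}:1). Time: {seconds:.1f}s.
Results: {n} tensors processed, {energy:.1f}% average energy retained
"""


def fill_template(stats: dict, full_size_mb: float) -> str:
    """Fill the paper template from an extraction_stats.json-style dict."""
    energies = [t["energy_retained"] for t in stats["tensor_stats"]]
    return TEMPLATE.format(
        model_a=stats["model_a"],
        model_b=stats["model_b"],
        rank=stats["rank"],
        full_mb=full_size_mb,
        adapter_mb=stats["adapter_size_mb"],
        ratio=full_size_mb / stats["adapter_size_mb"],
        seconds=stats["extraction_time_seconds"],
        n=stats["tensors_processed"],
        energy=sum(energies) / len(energies),
    )


# Placeholder values only; real input would be json.load(open("extraction_stats.json"))
stats = {
    "model_a": "org/model-a", "model_b": "org/model-b", "rank": 16,
    "tensors_processed": 2, "adapter_size_mb": 10.0, "extraction_time_seconds": 42.0,
    "tensor_stats": [{"energy_retained": 90.0}, {"energy_retained": 80.0}],
}
text = fill_template(stats, full_size_mb=1000.0)
print(text)
```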

---

## 10. AGENT MANIFEST

```yaml
# AGENT_MANIFEST.yaml — Declare capabilities for other agents
agent: extract-lora-diff
version: 1.0.0
capabilities:
  - weight_diff_extraction
  - svd_compression
  - lora_adapter_synthesis
  - gguf_conversion
  - paper_generation
inputs:
  - model_a: string (HF repo ID)
  - model_b: string (HF repo ID)
  - rank: integer (default: 16)
  - target_modules: list[string] (default: attention only)
outputs:
  - adapter_model.safetensors
  - adapter_config.json
  - extraction_stats.json
  - paper.md (optional)
  - figures/*.png (optional)
preconditions:
  - same_base_model
  - safetensors_format
  - python_3_10_plus
  - torch_installed
  - peft_installed
success_rate: 0.95
avg_duration_seconds: 180
```

---

**End of protocol.** Any AI agent reading this document can now execute weight-diff SVD LoRA extraction on arbitrary model pairs without human guidance.