=================================================================
BASE TIER SOUP ANALYSIS
Device: cuda
=================================================================
Loading checkpoint...
Loaded: mAP=0.825 cv=0.3117 epoch=19
clip_l14_openai loaded
dinov2_b14 loaded
siglip_b16_384 loaded
Running inference on 5000 val images...
Done: fused=torch.Size([5000, 128]) tri=torch.Size([5000, 256])
=================================================================
SCAN 1: ANCHOR GEOMETRY
=================================================================
Anchor pairwise cosine:
mean=0.0356 std=0.1896
max=0.9542 min=-0.9093
Max neighbor cosine per anchor:
mean=0.6949 std=0.2730
max=0.9542 min=0.0639
Anchor norms: mean=1.000000 std=0.000000
Anchor spectral: eff_rank=65.7/128
sv_max=5.0231 sv_10=2.8655 sv_50=0.9697 sv_min=0.125017
Anchor pentachoron CV: 0.2478
mean_vol=0.074751 std_vol=0.018524
=================================================================
SCAN 2: ANCHOR UTILIZATION
=================================================================
Active anchors: 1/256 (0.4%)
Visit counts: mean=19.5 std=312.5
max=5000 min=0
top 10: [5000, 0, 0, 0, 0, 0, 0, 0, 0, 0]
bottom 10: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Anchor entropy: -0.0000 / 5.5452 (-0.0%)
Per-anchor embedding density:
Intra-cluster cosine: mean=0.9693 std=0.0000
=================================================================
SCAN 3: PROJECTOR ANALYSIS
=================================================================
clip_l14:
norm: mean=1.000000 (should be 1.0)
self-sim off-diag: 0.9668
eff_dim: 24.1/128
dinov2_b14:
norm: mean=1.000000 (should be 1.0)
self-sim off-diag: 0.9678
eff_dim: 25.3/128
siglip_b16:
norm: mean=1.000000 (should be 1.0)
self-sim off-diag: 0.9501
eff_dim: 23.8/128
Expert agreement (cosine in 128-d):
clip_l14 × dinov2_b14 : mean=0.9898 std=0.0066 min=0.8730
clip_l14 × siglip_b16 : mean=0.9893 std=0.0052 min=0.9307
dinov2_b14 × siglip_b16 : mean=0.9855 std=0.0081 min=0.8920
Per-expert nearest anchor agreement:
clip_l14 × dinov2_b14 : same_anchor=1.0000 (100.0%)
clip_l14 × siglip_b16 : same_anchor=1.0000 (100.0%)
dinov2_b14 × siglip_b16 : same_anchor=1.0000 (100.0%)
Projector weight comparison:
clip_l14 : norm=37.4114 eff_rank=30.5/128
dinov2_b14 : norm=36.3149 eff_rank=23.0/128
siglip_b16 : norm=39.2079 eff_rank=29.0/128
clip_l14 × dinov2_b14 weight_cos=0.0046
clip_l14 × siglip_b16 weight_cos=-0.0049
dinov2_b14 × siglip_b16 weight_cos=-0.0055
=================================================================
SCAN 4: PATCHWORK COMPARTMENTS
=================================================================
Comp 0: 32 anchors
Comp 1: 32 anchors
Comp 2: 32 anchors
Comp 3: 32 anchors
Comp 4: 32 anchors
Comp 5: 32 anchors
Comp 6: 32 anchors
Comp 7: 32 anchors
Patchwork output: torch.Size([5000, 512])
norm: mean=11.6381 std=0.5046
comp 0: norm=2.8604 std_across_dims=0.0010
comp 1: norm=3.7652 std_across_dims=0.1596
comp 2: norm=2.3303 std_across_dims=0.0057
comp 3: norm=3.4802 std_across_dims=0.2053
comp 4: norm=3.3465 std_across_dims=0.1143
comp 5: norm=5.9651 std_across_dims=0.4720
comp 6: norm=6.0775 std_across_dims=0.4946
comp 7: norm=3.2188 std_across_dims=0.0718
=================================================================
SCAN 5: TRIANGULATION PATTERNS
=================================================================
Triangulation distances (1-cosine):
mean=0.8988 std=0.1301
min=0.0038 max=1.2538
Nearest anchor distance:
mean=0.0156 std=0.0042
max=0.0419 min=0.0038
Anchors within cos>0.5 per image:
mean=1.0 std=0.0
Top-10 nearest anchor distances:
k=0: mean=0.0156 std=0.0042
k=1: mean=0.6646 std=0.0218
k=2: mean=0.6806 std=0.0185
k=3: mean=0.6909 std=0.0173
k=4: mean=0.6977 std=0.0167
k=5: mean=0.7033 std=0.0162
k=6: mean=0.7081 std=0.0158
k=7: mean=0.7126 std=0.0154
k=8: mean=0.7166 std=0.0150
k=9: mean=0.7204 std=0.0147
=================================================================
SCAN 6: PER-CLASS ANCHOR AFFINITY
=================================================================
Top-3 anchors per class (first 20 classes):
person (n=2693): 65(2693/2693) 1(0/2693) 0(0/2693)
bicycle (n= 149): 65(149/149) 1(0/149) 0(0/149)
car (n= 535): 65(535/535) 1(0/535) 0(0/535)
motorcycle (n= 159): 65(159/159) 1(0/159) 0(0/159)
airplane (n= 97): 65(97/97) 1(0/97) 0(0/97)
bus (n= 189): 65(189/189) 1(0/189) 0(0/189)
train (n= 157): 65(157/157) 1(0/157) 0(0/157)
truck (n= 250): 65(250/250) 1(0/250) 0(0/250)
boat (n= 121): 65(121/121) 1(0/121) 0(0/121)
traffic light (n= 191): 65(191/191) 1(0/191) 0(0/191)
fire hydrant (n= 86): 65(86/86) 1(0/86) 0(0/86)
stop sign (n= 69): 65(69/69) 1(0/69) 0(0/69)
parking meter (n= 37): 65(37/37) 1(0/37) 0(0/37)
bench (n= 235): 65(235/235) 1(0/235) 0(0/235)
bird (n= 125): 65(125/125) 1(0/125) 0(0/125)
cat (n= 184): 65(184/184) 1(0/184) 0(0/184)
dog (n= 177): 65(177/177) 1(0/177) 0(0/177)
horse (n= 128): 65(128/128) 1(0/128) 0(0/128)
sheep (n= 65): 65(65/65) 1(0/65) 0(0/65)
cow (n= 87): 65(87/87) 1(0/87) 0(0/87)
Anchor specialization:
classes per anchor: mean=80.0 std=nan
max=80 min=80
=================================================================
SCAN 7: FUSED EMBEDDING GEOMETRY
=================================================================
Norms: mean=1.000000 std=0.000000
Self-sim (off-diag): 0.9693
Effective dim: 23.6/128
top-5 SVs explain 44.1%
top-10 SVs explain 70.2%
top-20 SVs explain 99.2%
top-50 SVs explain 100.0%
top-100 SVs explain 100.0%
Pentachoron CV: 0.3529
=================================================================
SCAN 8: EXPERT CONTRIBUTION
=================================================================
clip_l14 : cos_to_fused mean=0.9970 std=0.0016
dinov2_b14 : cos_to_fused mean=0.9957 std=0.0027
siglip_b16 : cos_to_fused mean=0.9955 std=0.0023
Without clip_l14 : cos_to_full=0.9992 (uniqueness=0.0008)
Without dinov2_b14 : cos_to_full=0.9989 (uniqueness=0.0011)
Without siglip_b16 : cos_to_full=0.9989 (uniqueness=0.0011)
Per-image expert disagreement:
Agreement: mean=0.9882 std=0.0055
Disagreement: mean=0.0041 std=0.0034
Most agreed image (1449): agreement=0.9978
labels: [22]
Most disagreed image (1435): agreement=0.9214
labels: [28]
=================================================================
ANALYSIS COMPLETE
=================================================================
/tmp/ipykernel_10600/3734699858.py:410: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /pytorch/aten/src/ATen/native/ReduceOps.cpp:1857.)
  f"std={anchor_class_count[anchor_class_count>0].std():.1f}")
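
The `std=nan` in SCAN 6 and the UserWarning above appear to share one cause: only one anchor is active (SCAN 2), so the filtered `anchor_class_count` holds a single value and a sample standard deviation over it is undefined. Below is a minimal sketch of a guard, written with the standard library rather than tensors so it runs on its own. The helper `safe_std` and the stand-in list are hypothetical, not part of the original script, and returning 0.0 for a single value is an assumption; reporting `nan` explicitly may be preferable.

```python
import statistics


def safe_std(values):
    """Sample standard deviation; returns 0.0 when fewer than two values exist.

    Hypothetical helper: the 0.0 fallback is an assumption, not the
    original script's behaviour (which prints nan).
    """
    values = list(values)
    if len(values) < 2:
        return 0.0
    return statistics.stdev(values)


# Hypothetical stand-in for anchor_class_count, built from the figures
# in the log: 256 anchors, and only anchor 65 is visited, by all 80 classes.
anchor_class_count = [0] * 256
anchor_class_count[65] = 80

# Same filter as the logged line: keep only anchors with at least one class.
active = [c for c in anchor_class_count if c > 0]

print(f"classes per anchor: mean={statistics.mean(active):.1f} "
      f"std={safe_std(active):.1f}")
```

The same helper applied to the SCAN 2 visit counts (one anchor with 5000 visits, 255 with none) gives a sample standard deviation of 312.5, which matches the logged value and suggests the script uses the sample (n-1) estimator there as well.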