FishingROV — YOLO26x L/R 1280 (augmented) — King scallop teacher

Zoo ID: det-scallop_yolo26x_lr_1280_aug · canonical weights: best.pt (training epoch 50)

High-capacity teacher detector for King scallops, trained on left/right split panels of 1080p survey frames upscaled to 1280 px.
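The left/right panel preparation described above can be sketched as follows. This is a minimal illustration, not the project's preprocessing code: it assumes that "1080p" means 1920x1080 frames, that the split is at the horizontal midpoint with no overlap, and that the 1280 px input refers to the long side of each panel. The function names are hypothetical.

```python
import numpy as np

def split_left_right(frame: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Split an H x W x C frame into left and right half-width panels.

    Assumption: the split is at the horizontal midpoint with no overlap.
    """
    mid = frame.shape[1] // 2
    return frame[:, :mid], frame[:, mid:]

def panel_box_to_frame(box_xyxy, side: str, frame_width: int, scale: float = 1.0):
    """Map an (x1, y1, x2, y2) box from panel pixels back to full-frame pixels.

    `scale` is resized panel size / original panel size (assumed uniform).
    """
    x1, y1, x2, y2 = (v / scale for v in box_xyxy)
    offset = frame_width // 2 if side == "right" else 0
    return (x1 + offset, y1, x2 + offset, y2)

# Assumed 1920x1080 survey frame (blank placeholder data).
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
left, right = split_left_right(frame)

# Assumed upscale factor that brings the panel's long side to 1280 px.
scale = 1280 / max(left.shape[:2])
print(left.shape, right.shape, round(scale, 3))
```

Detections made on a panel need the inverse mapping (`panel_box_to_frame`) before they are compared with full-frame annotations.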

FishingROV mirrors the same detector → crop → classifier pattern on two tiers with different models. On the GPU server (RTX 3090) this teacher generates regions of interest and feeds the cropped detections to a SwinV2 classifier. The on-device Aura tier runs the lighter scout detector with a MobileNetV2 classifier. This model is the 3090-side detector. The full pipeline is still to be validated.
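The detector → crop → classifier pattern can be sketched generically as below. The detector and classifier here are stand-in callables, not the real YOLO26x or SwinV2 models, and `detect_crop_classify` is a hypothetical name; the same function shape would apply to either tier with different models plugged in.

```python
from typing import Callable, Sequence
import numpy as np

Box = tuple[int, int, int, int]  # x1, y1, x2, y2 in pixels

def detect_crop_classify(
    image: np.ndarray,
    detector: Callable[[np.ndarray], Sequence[Box]],
    classifier: Callable[[np.ndarray], str],
) -> list[tuple[Box, str]]:
    """Run the detector, crop each region of interest, classify each crop."""
    results = []
    for box in detector(image):
        x1, y1, x2, y2 = box
        crop = image[y1:y2, x1:x2]
        if crop.size == 0:  # skip degenerate boxes
            continue
        results.append((box, classifier(crop)))
    return results

# Stub models for illustration only (hypothetical behaviour).
def stub_detector(img: np.ndarray) -> list[Box]:
    return [(10, 10, 50, 50), (20, 20, 20, 40)]  # second box has zero width

def stub_classifier(crop: np.ndarray) -> str:
    return "king" if crop.shape[0] == 40 else "not_a_scallop"

image = np.zeros((100, 100, 3), dtype=np.uint8)
results = detect_crop_classify(image, stub_detector, stub_classifier)
print(results)
```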

Metrics (honest, station-disjoint held-out)

Re-validated with model.val(imgsz=1280, conf=0.001, iou=0.6) on the stations in the public Zenodo "Test files" set, which are locations never seen during training.
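A minimal sketch of how such a re-validation might be run with the Ultralytics Python API. The weights path and dataset YAML are hypothetical placeholders, and it is an assumption that the installed `ultralytics` release can load YOLO26x weights; only the three validation arguments come from this card.

```python
# Validation settings quoted in this card.
VAL_KWARGS = {"imgsz": 1280, "conf": 0.001, "iou": 0.6}

def revalidate(weights_path: str, data_yaml: str) -> dict:
    """Run Ultralytics validation and return the headline box metrics.

    `weights_path` and `data_yaml` are caller-supplied (hypothetical) paths;
    the YAML must point its val split at the held-out stations.
    """
    # Imported lazily so this module loads even without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights_path)
    metrics = model.val(data=data_yaml, **VAL_KWARGS)
    return {
        "mAP50": metrics.box.map50,
        "mAP50-95": metrics.box.map,
        "precision": metrics.box.mp,
        "recall": metrics.box.mr,
    }

# Example call (paths are placeholders, not files shipped with this model):
# print(revalidate("best.pt", "heldout_stations.yaml"))
```

The low `conf` threshold keeps nearly all candidate boxes so that the precision-recall curve, and therefore mAP, is computed over the full confidence range.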

Metric	Value
mAP50	0.705
mAP50-95	0.443
Precision	0.737
Recall	0.637
Peak single-epoch mAP50	0.712

On data integrity. Validation panels come from the public Zenodo "Test files" stations, which are station-disjoint from training, and are byte-identical to the non-augmented teacher's val set; only the training set was augmented. The reported numbers are therefore held-out metrics on unseen stations, not the inflated scores a random-frame split would give.

Model details

Property	Value
Architecture	YOLO26x
Input size	1280 px (left/right split panels)
Classes	1 (scallop)
Train panels	17241 (augmented)
Val panels	1376
Source dataset	`DS-LR1280-v1-aug`

This is the best-scoring L/R teacher in the FishingROV zoo on honest held-out data. Augmentation raised mAP50 by about 0.05 (0.657 → 0.705) over the non-augmented baseline, scallop_yolo26x_lr_1280, on the same held-out stations.

SwinV2 classifier metrics (same-crop eval)

The 3090-tier classifier paired with this detector is SwinV2-B (256). It was trained on DS-CLS224 (classifier_data) and evaluated on its station-disjoint val split, derived from the Zenodo "Test files" stations (no random frame mixing). Crops are square, centered on human-annotated boxes, padded if needed, then resized to 224 px; negatives are sampled away from ground-truth boxes.
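The crop recipe above (square, box-centered, padded, resized to 224 px) can be sketched as follows. Several details are assumptions because the card does not state them: the square side is taken as the longer box edge, padding is zero-fill, and the resize is nearest-neighbour to keep the example dependency-free. `square_crop` is a hypothetical name.

```python
import numpy as np

def square_crop(image: np.ndarray, box_xyxy, out_size: int = 224) -> np.ndarray:
    """Cut a square window centered on the box, zero-pad outside the image,
    then resize to out_size x out_size with nearest-neighbour sampling."""
    x1, y1, x2, y2 = box_xyxy
    side = max(int(np.ceil(max(x2 - x1, y2 - y1))), 1)  # assumed: longer edge
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    left = int(round(cx - side / 2))
    top = int(round(cy - side / 2))

    h, w = image.shape[:2]
    canvas = np.zeros((side, side) + image.shape[2:], dtype=image.dtype)

    # Part of the window that actually overlaps the image.
    sx1, sy1 = max(left, 0), max(top, 0)
    sx2, sy2 = min(left + side, w), min(top + side, h)
    if sx2 > sx1 and sy2 > sy1:
        canvas[sy1 - top:sy2 - top, sx1 - left:sx2 - left] = image[sy1:sy2, sx1:sx2]

    # Nearest-neighbour resize: pick a source index for every output pixel.
    idx = np.arange(out_size) * side // out_size
    return canvas[idx][:, idx]

# A box that hangs over the left image edge, so the crop needs padding.
image = np.ones((100, 200, 3), dtype=np.uint8)
crop = square_crop(image, (-10, 20, 30, 40))
print(crop.shape)
```

In practice an interpolating resize (bilinear or bicubic) would normally replace the nearest-neighbour step.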

Metric	Value
Macro precision	0.700
Macro recall	0.654
Macro F1	0.661
Accuracy	0.966

Per-class metrics (from class_eval_best.json):

Class	Precision	Recall	F1	Support
dead	0.464	0.642	0.539	81
king	0.391	0.237	0.295	76
not_a_scallop	0.991	0.996	0.993	5781
queen	0.818	0.899	0.857	296
recessed	0.837	0.497	0.623	145
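As a consistency check, the macro figures can be recomputed from the per-class rows above. The sketch below uses only values from the table; the accuracy is an approximation, because it is reconstructed from recalls that are rounded to three decimals.

```python
# (precision, recall, F1, support) per class, copied from the table above.
per_class = {
    "dead": (0.464, 0.642, 0.539, 81),
    "king": (0.391, 0.237, 0.295, 76),
    "not_a_scallop": (0.991, 0.996, 0.993, 5781),
    "queen": (0.818, 0.899, 0.857, 296),
    "recessed": (0.837, 0.497, 0.623, 145),
}

n = len(per_class)
# Macro averages weight every class equally, regardless of support.
macro_p = sum(p for p, _, _, _ in per_class.values()) / n
macro_r = sum(r for _, r, _, _ in per_class.values()) / n
macro_f1 = sum(f for _, _, f, _ in per_class.values()) / n

# Accuracy = correct / total; correct per class is roughly recall * support.
total = sum(s for _, _, _, s in per_class.values())
approx_acc = sum(r * s for _, r, _, s in per_class.values()) / total

print(f"macro P={macro_p:.3f} R={macro_r:.3f} F1={macro_f1:.3f}")
print(f"approx accuracy={approx_acc:.3f} over {total} crops")
```

The macro values reproduce the summary table (0.700, 0.654, 0.661), and the reconstructed accuracy agrees with the reported 0.966 to within the rounding of the per-class recalls. The gap between accuracy and macro F1 reflects class imbalance: not_a_scallop accounts for 5781 of the 6379 crops.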

Intended use & limitations

The 3090-side detector: it generates regions of interest and feeds the cropped detections to a SwinV2 classifier. The same detector → classifier pattern is mirrored on the on-device Aura tier with a lighter scout detector and a MobileNetV2 classifier (different models).
Also usable as an offline pseudo-labelling / auto-annotation teacher to bootstrap training data. Not a final stock-assessment instrument.
The full pipeline is still to be validated.
Trained only on the public St Andrews survey distribution; performance on other gear, lighting, or substrate is unverified.
Partially buried scallops and king-scallop instances remain the hardest cases.
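For the pseudo-labelling use mentioned above, detections have to be written out as training labels. The sketch below converts pixel boxes to YOLO-format label lines (class id followed by normalized center x, center y, width, height). The confidence cut-off of 0.5 is a hypothetical value, not a recommendation from this card, and class id 0 is assumed for the single scallop class.

```python
def to_yolo_lines(detections, img_w: int, img_h: int,
                  min_conf: float = 0.5, class_id: int = 0) -> list[str]:
    """Convert (x1, y1, x2, y2, conf) pixel detections to YOLO label lines.

    min_conf is a hypothetical threshold; low-confidence boxes are dropped
    so that they do not pollute the bootstrapped training set.
    """
    lines = []
    for x1, y1, x2, y2, conf in detections:
        if conf < min_conf:
            continue
        cx = (x1 + x2) / 2 / img_w   # normalized box center
        cy = (y1 + y2) / 2 / img_h
        w = (x2 - x1) / img_w        # normalized box size
        h = (y2 - y1) / img_h
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines

# Two made-up detections on a 1000x1000 image; the second is below threshold.
labels = to_yolo_lines([(0, 0, 100, 100, 0.9), (0, 0, 10, 10, 0.1)], 1000, 1000)
print("\n".join(labels))
```

Pseudo-labels produced this way inherit the teacher's errors, so a human review pass is advisable before they are used for training.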

Files

best.pt — canonical weights (fitness-best epoch 50).
last.pt — final-epoch weights.
results.csv, results.png, curves — training history and PR/F1 curves.

Attribution & License

This model is a derivative work based on the University of St Andrews King Scallop dataset.

Original DOI: 10.5281/zenodo.10156830

In accordance with the original dataset's terms, this derivative work is released under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license. You are free to share and adapt this material, provided you give appropriate credit to the original authors and indicate if changes were made.
