CrackSeg
A fine-tuned CLIPSeg model for pixel-wise surface crack detection. Given an image of a surface and the text prompt "segment crack", the model returns a binary segmentation mask highlighting crack regions.
Model Performance
| Metric | Score |
|---|---|
| Dice Score | 0.612 |
| mIoU | 0.716 |
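The table does not say how the two metrics were computed. The sketch below shows the standard definitions for binary masks, assuming (this is an assumption, not something the card states) that mIoU is averaged over the crack and background classes. For a single class Dice is never lower than IoU, so an mIoU above the Dice score is consistent with background pixels being included in the average. The function names are illustrative.

```python
import torch


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient for the positive (crack) class of two binary masks."""
    pred, target = pred.bool(), target.bool()
    intersection = (pred & target).sum().item()
    total = pred.sum().item() + target.sum().item()
    return (2 * intersection + eps) / (total + eps)


def iou_score(pred, target, eps=1e-7):
    """Intersection over union for the positive class of two binary masks."""
    pred, target = pred.bool(), target.bool()
    intersection = (pred & target).sum().item()
    union = (pred | target).sum().item()
    return (intersection + eps) / (union + eps)


def mean_iou(pred, target):
    """Average of crack IoU and background IoU (assumed definition of mIoU)."""
    pred, target = pred.bool(), target.bool()
    return 0.5 * (iou_score(pred, target) + iou_score(~pred, ~target))


# Toy 4x4 example: the prediction marks two pixels, the ground truth one.
pred = torch.zeros(4, 4)
pred[0, 0] = 1
pred[0, 1] = 1
target = torch.zeros(4, 4)
target[0, 0] = 1

print(dice_score(pred, target), iou_score(pred, target), mean_iou(pred, target))
```

On mostly-background images like this toy case, the background IoU is close to 1 and pulls the class-averaged mIoU above the crack-only Dice.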
Live Demo
Try it on Hugging Face Spaces.
Training Details
- Dataset: 14,000+ crack images (Roboflow, COCO format)
- Fine-tuning: partial. The decoder is fully unfrozen, along with the last 2 layers of the CLIP vision encoder and the last layer of the CLIP text encoder; all other weights stay frozen
- Loss: Focal Loss (α=0.75, γ=2.0)
- Optimizer: AdamW with differential learning rates
- Scheduler: CosineAnnealingLR
- Early stopping: patience = 5
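The training recipe above can be sketched roughly as follows. This is a minimal illustration, not the actual training script: the focal loss assumes α weights the crack (positive) class, `ToyModel` is a hypothetical stand-in for CLIPSeg, and the learning rates, weight decay, and `T_max` are placeholder values that the card does not specify.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def binary_focal_loss(logits, targets, alpha=0.75, gamma=2.0):
    """Binary focal loss on raw logits; targets are 0/1 floats of the same shape."""
    # Per-pixel cross-entropy, computed from logits for numerical stability.
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    probs = torch.sigmoid(logits)
    # p_t is the probability assigned to the true class of each pixel.
    p_t = probs * targets + (1 - probs) * (1 - targets)
    # alpha weights crack pixels, (1 - alpha) weights background pixels (assumed).
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    # (1 - p_t) ** gamma down-weights pixels the model already classifies well.
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()


class ToyModel(nn.Module):
    """Hypothetical stand-in: a layered encoder plus a decoder head."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 8))
        self.decoder = nn.Linear(8, 1)


toy = ToyModel()

# Partial fine-tuning: freeze the whole encoder, then unfreeze its last layer.
for param in toy.encoder.parameters():
    param.requires_grad = False
for param in toy.encoder[-1].parameters():
    param.requires_grad = True

# Differential learning rates: one parameter group per part of the model.
# The values below are placeholders, not the rates used for CrackSeg.
optimizer = torch.optim.AdamW(
    [
        {"params": list(toy.decoder.parameters()), "lr": 1e-4},
        {"params": list(toy.encoder[-1].parameters()), "lr": 1e-5},
    ],
    weight_decay=0.01,
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)
```

Giving the pretrained encoder layers a smaller rate than the decoder is a common way to adapt them without overwriting what they learned during pretraining.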
Usage
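```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image
from transformers import AutoProcessor, CLIPSegForImageSegmentation

# Load the base CLIPSeg processor and architecture.
processor = AutoProcessor.from_pretrained("CIDAS/clipseg-rd64-refined")
model = CLIPSegForImageSegmentation.from_pretrained("CIDAS/clipseg-rd64-refined")

# Download the fine-tuned checkpoint and load its weights into the model.
# weights_only=False unpickles arbitrary objects, so only load checkpoints you trust.
path = hf_hub_download(repo_id="primus29/crackseg", filename="best_model.pth")
checkpoint = torch.load(path, map_location="cpu", weights_only=False)
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Convert to RGB so grayscale or RGBA inputs do not break preprocessing.
image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(text="segment crack", images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# Turn logits into probabilities, then threshold at 0.5 for a binary mask.
mask = torch.sigmoid(outputs.logits).squeeze()
mask = (mask > 0.5).float()
```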
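The mask produced above is at the model's working resolution (typically 352×352 for the CLIPSeg processor), not at the size of the input image. The sketch below is one way to resize it back and save it; `mask_to_pil` is a hypothetical helper, not part of the released code, and nearest-neighbour interpolation is a choice made here to keep the mask strictly binary.

```python
import torch
import torch.nn.functional as F
from PIL import Image


def mask_to_pil(mask, size):
    """Resize a 2-D binary mask tensor to `size` = (width, height) as an 8-bit image."""
    # interpolate expects (N, C, H, W), while PIL sizes are (width, height).
    resized = F.interpolate(
        mask[None, None].float(), size=(size[1], size[0]), mode="nearest"
    )
    # Map {0, 1} to {0, 255} so the mask is visible in any image viewer.
    array = (resized[0, 0] > 0.5).to(torch.uint8).mul(255).cpu().numpy()
    return Image.fromarray(array)


# Stand-in mask: the top half is "crack". In practice, pass the `mask` tensor
# from the usage snippet and `image.size` from the original PIL image.
demo_mask = torch.zeros(352, 352)
demo_mask[:176] = 1
demo_image = mask_to_pil(demo_mask, (640, 480))
demo_image.save("crack_mask.png")
```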
Limitations
- Shadow regions can be misidentified as cracks
- Performance degrades on very thin hairline cracks
- Trained primarily on surface/concrete crack data; may not generalize to all materials
Base Model
CIDAS/clipseg-rd64-refined