Paper: Tunable Soft Equivariance with Guarantees
Authors: Md Ashiqur Rahman, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh
This repository hosts one of the soft-equivariant vision models introduced in our paper: a soft-equivariant ViT fine-tuned on ADE20K for semantic segmentation, with a linear segmentation head.
Note: All models require `trust_remote_code=True` because they use custom model classes.
from transformers import AutoModel, AutoConfig
import torch
import torch.nn.functional as F

model_id = "ashiq24/softeq-dinov2-base-ade20k-seg-c4-s0.8-sp1.0"
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()

# Input size must match model training resolution (e.g., 512×512)
pixel_values = torch.randn(1, 3, 512, 512)

with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

# outputs.logits shape: (1, num_labels, H, W), already upsampled to input resolution
seg_map = outputs.logits.argmax(dim=1)  # (1, H, W) predicted label per pixel
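Because the model is described as softly equivariant to rotations, one quick sanity check is to compare the output of a rotated input with the rotated output of the original input. The sketch below measures that gap for 90° rotations (the C4 case) using `torch.rot90`. The helper name `c4_equivariance_error` and the two stand-in functions are hypothetical and not part of this repository, and square inputs are assumed.

```python
import torch


def c4_equivariance_error(fn, x, k=1):
    """Relative gap between fn(rotate(x)) and rotate(fn(x)) for a k*90 degree rotation.

    fn maps a (B, C, H, W) tensor to a (B, C', H, W) tensor; H == W is assumed.
    """
    def rotate(t):
        return torch.rot90(t, k, dims=(-2, -1))

    with torch.no_grad():
        out_of_rotated = fn(rotate(x))
        rotated_out = rotate(fn(x))
    gap = (out_of_rotated - rotated_out).norm()
    return (gap / rotated_out.norm().clamp_min(1e-12)).item()


torch.manual_seed(0)
x = torch.randn(1, 3, 32, 32)

# Stand-in 1: a 1x1 convolution acts on each pixel independently,
# so it commutes with 90 degree rotations.
pointwise = torch.nn.Conv2d(3, 5, kernel_size=1)


# Stand-in 2: adding a ramp along the width makes the output depend on
# absolute position, which breaks rotation equivariance.
def position_dependent(t):
    ramp = torch.linspace(0, 1, t.shape[-1])
    return pointwise(t) + ramp


err_eq = c4_equivariance_error(pointwise, x)
err_not = c4_equivariance_error(position_dependent, x)
print(f"pointwise conv:      {err_eq:.2e}")
print(f"position-dependent:  {err_not:.2e}")
```

To probe the checkpoint itself, the same helper could be called as `c4_equivariance_error(lambda t: model(pixel_values=t).logits, pixel_values)`. For a soft-equivariant model the result may be small without being exactly zero.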
The SoftEqConfig class stores all architectural parameters. Key fields:
| Parameter | Type | Description |
|---|---|---|
| `n_rotations` | `int` | Size of the discrete rotation group (e.g., 4 for C4, 720 for near-continuous) |
| `soft_thresholding` | `float` | Softness of the patch-embedding filter in [0, 1]; 0 = strict equivariance, 1 = no filter |
| `soft_thresholding_pos` | `float` | Softness of the positional-embedding filter in [0, 1] |
| `group_type` | `str` | Symmetry group: `"rotation"` or `"roto_reflection"` |
| `hard_mask` | `bool` | Use a hard (step-function) mask instead of exponential damping |
| `model_arch` | `str` | Architecture variant (see table above) |
| `pretrained_model` | `str` | HuggingFace identifier of the base backbone |
| `num_labels` | `int` | Number of output classes |
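To make the types and value ranges of these fields concrete, here is a minimal plain-Python sketch that mirrors a subset of them and checks the documented constraints. `SoftEqFieldsSketch` is a hypothetical stand-in written for illustration, not the repository's `SoftEqConfig`, and the example values are illustrative rather than read from the checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftEqFieldsSketch:
    """Hypothetical mirror of some SoftEqConfig fields (illustration only)."""

    n_rotations: int
    soft_thresholding: float
    soft_thresholding_pos: float
    group_type: str
    hard_mask: bool
    num_labels: int

    def __post_init__(self):
        if self.n_rotations < 1:
            raise ValueError("n_rotations must be a positive integer")
        # Both softness values are documented as lying in [0, 1].
        for name in ("soft_thresholding", "soft_thresholding_pos"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")
        if self.group_type not in ("rotation", "roto_reflection"):
            raise ValueError(f"unknown group_type: {self.group_type!r}")
        if self.num_labels < 1:
            raise ValueError("num_labels must be a positive integer")

    def rotation_angles(self):
        """Angles in degrees of the cyclic group with n_rotations elements."""
        step = 360.0 / self.n_rotations
        return [k * step for k in range(self.n_rotations)]


# Illustrative values only (assumed, not loaded from the model).
example = SoftEqFieldsSketch(
    n_rotations=4,
    soft_thresholding=0.8,
    soft_thresholding_pos=1.0,
    group_type="rotation",
    hard_mask=False,
    num_labels=150,
)
print(example.rotation_angles())
```

With the real model, the corresponding values would be read as attributes of the `config` object loaded by `AutoConfig.from_pretrained`, assuming the custom config class exposes them under the names in the table.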
If you use these models in your research, please cite:
@article{rahman2026tunable,
  title={Tunable Soft Equivariance with Guarantees},
  author={Rahman, Md Ashiqur and Hao, Lim Jun and Jiang, Jeremiah and Lim, Teck-Yian and Yeh, Raymond A},
  journal={arXiv preprint arXiv:2603.26657},
  year={2026}
}
Base model: facebook/dinov2-base