---
license: mit
tags:
- soft-equivariance
- equivariant
- vit
base_model:
- facebook/dinov2-base
---
# Tunable Soft Equivariance with Guarantees

**Paper**: [Tunable Soft Equivariance with Guarantees](https://arxiv.org/abs/2603.26657)  
**Authors**: Md Ashiqur Rahman, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh

---

## Overview

This repository hosts a soft-equivariant vision model introduced in our paper: a soft-equivariant ViT fine-tuned on PASCAL VOC for semantic segmentation, using a linear segmentation head.

---


## Usage

> **Note**: All models require `trust_remote_code=True` because they use custom model classes.


### Semantic Segmentation (ViT backbone)

```python
from transformers import AutoModel, AutoConfig
import torch
import torch.nn.functional as F

model_id = "ashiq24/softeq-vit-base-patch16-224-voc-seg-c4-s0.9-sp0.9"

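# The config exposes the architectural fields listed under "Configuration Parameters"
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model  = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()  # inference mode: disables dropout and other training-only behavior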

# Input size must match model training resolution (e.g., 224×224)
pixel_values = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

# outputs.logits shape: (1, num_labels, H, W) — already upsampled to input resolution
seg_map = outputs.logits.argmax(dim=1)   # (1, H, W) predicted label per pixel
```
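
Because this checkpoint targets the C4 rotation group, one way to see how close it is to equivariant is to compare the prediction for a rotated input against the rotated prediction for the original input. The following is a minimal sketch, not the evaluation protocol from the paper: `c4_equivariance_error` is a hypothetical helper, and the `pointwise` stand-in predictor is used only so the snippet runs without downloading weights.

```python
import torch

def c4_equivariance_error(predict, pixel_values):
    """Relative error between predict(rot(x)) and rot(predict(x)) for 90, 180, 270 degrees.

    `predict` is any callable mapping (B, C, H, W) to (B, K, H, W).
    Hypothetical helper for illustration; H must equal W.
    """
    errors = []
    with torch.no_grad():
        reference = predict(pixel_values)
        for k in (1, 2, 3):
            # Rotate the input by k * 90 degrees over the spatial dims, then predict
            out_of_rotated = predict(torch.rot90(pixel_values, k, dims=(2, 3)))
            # Rotate the original prediction by the same angle
            rotated_out = torch.rot90(reference, k, dims=(2, 3))
            diff = (out_of_rotated - rotated_out).norm()
            errors.append((diff / rotated_out.norm().clamp_min(1e-12)).item())
    return errors

# Stand-in predictor (assumption): a per-pixel channel mean, which commutes
# with 90-degree rotations, so its error is zero.
def pointwise(x):
    return x.mean(dim=1, keepdim=True)

torch.manual_seed(0)
x = torch.randn(1, 3, 224, 224)
errors = c4_equivariance_error(pointwise, x)
print(errors)
```

To apply the same check to the loaded model, pass `lambda x: model(pixel_values=x).logits` as `predict`. A softly equivariant model is expected to give small but not necessarily zero errors.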


---

## Configuration Parameters

The `SoftEqConfig` class stores all architectural parameters. Key fields:

| Parameter | Type | Description |
|---|---|---|
| `n_rotations` | `int` | Size of the discrete rotation group (e.g., `4` for C4, `720` for near-continuous) |
| `soft_thresholding` | `float` | Softness of the patch-embedding filter in `[0, 1]`; `0` = strict equivariance, `1` = no filter |
| `soft_thresholding_pos` | `float` | Softness of the positional-embedding filter in `[0, 1]` |
| `group_type` | `str` | Symmetry group: `"rotation"` or `"roto_reflection"` |
| `hard_mask` | `bool` | Use a hard (step-function) mask instead of exponential damping |
| `model_arch` | `str` | Architecture variant |
| `pretrained_model` | `str` | Hugging Face identifier of the base backbone |
| `num_labels` | `int` | Number of output classes |
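
The value ranges in the table can be made concrete with a small validation sketch. `SoftEqSettings` below is a hypothetical stand-in written for illustration; it is not the repository's `SoftEqConfig`, it covers only the symmetry-related fields, and the example values are illustrative rather than read from the checkpoint.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SoftEqSettings:
    """Hypothetical mirror of the symmetry-related fields (not SoftEqConfig)."""
    n_rotations: int
    soft_thresholding: float
    soft_thresholding_pos: float
    group_type: str = "rotation"
    hard_mask: bool = False

    def __post_init__(self):
        if self.n_rotations < 1:
            raise ValueError("n_rotations must be a positive integer")
        # Both softness values are documented as lying in [0, 1]
        for name in ("soft_thresholding", "soft_thresholding_pos"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")
        if self.group_type not in ("rotation", "roto_reflection"):
            raise ValueError(f"unknown group_type: {self.group_type!r}")

    def rotation_angles(self):
        """Angles in degrees of the discrete rotation group."""
        return [360.0 * k / self.n_rotations for k in range(self.n_rotations)]

    def group_order(self):
        """Number of group elements; adding reflections doubles it."""
        return self.n_rotations * (2 if self.group_type == "roto_reflection" else 1)

# Illustrative values for a C4 setup
settings = SoftEqSettings(n_rotations=4, soft_thresholding=0.9, soft_thresholding_pos=0.9)
print(settings.rotation_angles())
print(settings.group_order())
```

With `n_rotations=4` the rotation group consists of multiples of 90 degrees; switching `group_type` to `"roto_reflection"` gives eight elements instead of four.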

---

## Citation

If you use these models in your research, please cite:

```bibtex
@article{rahman2026tunable,
  title={Tunable Soft Equivariance with Guarantees},
  author={Rahman, Md Ashiqur and Hao, Lim Jun and Jiang, Jeremiah and Lim, Teck-Yian and Yeh, Raymond A},
  journal={arXiv preprint arXiv:2603.26657},
  year={2026}
}
```