---
license: mit
tags:
- soft-equivariance
- equivariant
- vit
base_model:
- facebook/dinov2-base
datasets:
- 1aurent/ADE20K
---

# Tunable Soft Equivariance with Guarantees

**Paper**: [Tunable Soft Equivariance with Guarantees](https://arxiv.org/abs/2603.26657)

**Authors**: Md Ashiqur Rahman, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh

---

## Overview

This repository hosts one of the soft-equivariant vision models introduced in our paper: a soft-equivariant ViT built on the DINOv2-base backbone and fine-tuned on ADE20K for semantic segmentation. It uses a linear segmentation head.

---

## Usage

> **Note**: All models require `trust_remote_code=True` because they use custom model classes.

### Semantic Segmentation (ViT backbone)

```python
from transformers import AutoModel, AutoConfig
import torch
import torch.nn.functional as F

model_id = "ashiq24/softeq-dinov2-base-ade20k-seg-c4-s0.8-sp1.0"

# Custom model classes live in the repository, hence trust_remote_code=True.
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()

# Input size must match the model's training resolution (e.g., 512×512).
pixel_values = torch.randn(1, 3, 512, 512)

with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

# outputs.logits shape: (1, num_labels, H, W), already upsampled to input resolution
seg_map = outputs.logits.argmax(dim=1)  # (1, H, W) predicted label per pixel
```

---

## Configuration Parameters

The `SoftEqConfig` class stores all architectural parameters.
Key fields:

| Parameter | Type | Description |
|---|---|---|
| `n_rotations` | `int` | Size of the discrete rotation group (e.g., `4` for C4, `720` for near-continuous) |
| `soft_thresholding` | `float` | Softness of the patch-embedding filter in `[0, 1]`; `0` = strict equivariance, `1` = no filter |
| `soft_thresholding_pos` | `float` | Softness of the positional-embedding filter in `[0, 1]` |
| `group_type` | `str` | Symmetry group: `"rotation"` or `"roto_reflection"` |
| `hard_mask` | `bool` | Use a hard (step-function) mask instead of exponential damping |
| `model_arch` | `str` | Architecture variant (see table above) |
| `pretrained_model` | `str` | HuggingFace identifier of the base backbone |
| `num_labels` | `int` | Number of output classes |

---

## Citation

If you use these models in your research, please cite:

```bibtex
@article{rahman2026tunable,
  title={Tunable Soft Equivariance with Guarantees},
  author={Rahman, Md Ashiqur and Hao, Lim Jun and Jiang, Jeremiah and Lim, Teck-Yian and Yeh, Raymond A},
  journal={arXiv preprint arXiv:2603.26657},
  year={2026}
}
```
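
---

## Checking Equivariance

The `n_rotations` and `soft_thresholding` fields describe how closely the model should commute with rotations. One way to probe this for the C4 case is to compare "rotate, then predict" against "predict, then rotate". The sketch below is illustrative only and is not taken from the paper or the repository: `c4_equivariance_error`, `pointwise_model`, and `row_biased_model` are hypothetical names, and the two toy functions stand in for a real segmentation model so the snippet runs with NumPy alone.

```python
import numpy as np


def c4_equivariance_error(predict, image, k=1):
    """Relative gap between predict(rot(image)) and rot(predict(image)).

    predict: hypothetical callable mapping a (C, H, W) array to a
             (num_labels, H, W) array of per-pixel scores.
    image:   square (C, H, W) array.
    k:       number of 90-degree rotations to apply.
    """
    rotate_then_predict = predict(np.rot90(image, k, axes=(1, 2)))
    predict_then_rotate = np.rot90(predict(image), k, axes=(1, 2))
    gap = np.linalg.norm(rotate_then_predict - predict_then_rotate)
    scale = np.linalg.norm(predict_then_rotate) + 1e-12
    return gap / scale


def pointwise_model(x):
    # Toy stand-in: purely per-pixel operations commute with rotation.
    return np.stack([x.sum(axis=0), (x ** 2).sum(axis=0)])


def row_biased_model(x):
    # Toy stand-in: adding a vertical ramp ties the output to the absolute
    # row position, which breaks rotation equivariance.
    ramp = np.linspace(0.0, 1.0, x.shape[-2])[:, None]
    return pointwise_model(x) + ramp


rng = np.random.default_rng(0)
image = rng.standard_normal((3, 8, 8))

err_equivariant = c4_equivariance_error(pointwise_model, image)
err_biased = c4_equivariance_error(row_biased_model, image)
print(f"pointwise model:  {err_equivariant:.2e}")
print(f"row-biased model: {err_biased:.2e}")
```

To apply the same check to the model loaded in the Usage section, `predict` would need to wrap the forward pass and return `outputs.logits` for a single image; with tensors, `torch.rot90(x, k, dims=(-2, -1))` plays the role of `np.rot90`. For a soft-equivariant model the error is not expected to be exactly zero. Based on the parameter descriptions above, it should presumably shrink as `soft_thresholding` approaches `0`.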