---
license: mit
tags:
- soft-equivariance
- equivariant
- vit
base_model:
- facebook/dinov2-base
---

# Tunable Soft Equivariance with Guarantees

**Paper**: [Tunable Soft Equivariance with Guarantees](https://arxiv.org/abs/2603.26657)

**Authors**: Md Ashiqur Rahman, Lim Jun Hao, Jeremiah Jiang, Teck-Yian Lim, Raymond A. Yeh

---

## Overview

This repository hosts one of the soft-equivariant vision models introduced in our paper: a soft-equivariant ViT fine-tuned on PASCAL VOC for semantic segmentation, using a linear segmentation head.

---

## Usage

> **Note**: All models require `trust_remote_code=True` because they use custom model classes.

### Semantic Segmentation (ViT backbone)

```python
from transformers import AutoModel, AutoConfig
import torch

model_id = "ashiq24/softeq-vit-base-patch16-224-voc-seg-c4-s0.9-sp0.9"

# The config is loaded separately only so its fields can be inspected;
# AutoModel.from_pretrained fetches it on its own as well.
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
model.eval()

# Input size must match model training resolution (e.g., 224×224)
pixel_values = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    outputs = model(pixel_values=pixel_values)

# outputs.logits shape: (1, num_labels, H, W), already upsampled to input resolution
seg_map = outputs.logits.argmax(dim=1)  # (1, H, W) predicted label per pixel
```

---

## Configuration Parameters

The `SoftEqConfig` class stores all architectural parameters.
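As a minimal sketch of how these parameters might be inspected, the snippet below collects the key fields of a loaded config into a dictionary. The helpers `summarize_config` and `load_and_summarize` are hypothetical and not part of this repository, and it is an assumption that each field is exposed as a plain attribute on the config object; any field that is not found is reported as `None`.

```python
from typing import Any, Dict

# Field names copied from the "Key fields" table in this README.
KEY_FIELDS = (
    "n_rotations",
    "soft_thresholding",
    "soft_thresholding_pos",
    "group_type",
    "hard_mask",
    "model_arch",
    "pretrained_model",
    "num_labels",
)


def summarize_config(config: Any) -> Dict[str, Any]:
    """Return the key fields of a config object; missing fields map to None."""
    # Assumption: fields are plain attributes on the config object.
    return {name: getattr(config, name, None) for name in KEY_FIELDS}


def load_and_summarize(model_id: str) -> Dict[str, Any]:
    """Download a config from the Hub and summarize it (needs network access)."""
    # Imported lazily so summarize_config works without transformers installed.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
    return summarize_config(config)
```

Calling `load_and_summarize(model_id)` with the model identifier from the usage example should return one entry per key field, which is a quick way to confirm which group size and softness values a given checkpoint was built with.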
Key fields:

| Parameter | Type | Description |
|---|---|---|
| `n_rotations` | `int` | Size of the discrete rotation group (e.g., `4` for C4, `720` for near-continuous) |
| `soft_thresholding` | `float` | Softness of the patch-embedding filter in `[0, 1]`; `0` = strict equivariance, `1` = no filter |
| `soft_thresholding_pos` | `float` | Softness of the positional-embedding filter in `[0, 1]` |
| `group_type` | `str` | Symmetry group: `"rotation"` or `"roto_reflection"` |
| `hard_mask` | `bool` | Use a hard (step-function) mask instead of exponential damping |
| `model_arch` | `str` | Architecture variant (see table above) |
| `pretrained_model` | `str` | HuggingFace identifier of the base backbone |
| `num_labels` | `int` | Number of output classes |

---

## Citation

If you use these models in your research, please cite:

```bibtex
@article{rahman2026tunable,
  title={Tunable Soft Equivariance with Guarantees},
  author={Rahman, Md Ashiqur and Hao, Lim Jun and Jiang, Jeremiah and Lim, Teck-Yian and Yeh, Raymond A},
  journal={arXiv preprint arXiv:2603.26657},
  year={2026}
}
```