NULA - Base

v0.1.0 · Anti-Aliased Residual CNN · CIFAR-10 · Adversarial Robustness

License: MIT HuggingFace

Figure: perturbation examples. From left to right: Clean (32x32), Bilinear Resize (8x8), Hard Decimation (stride-2), and Checkerboard Aliasing (high-frequency injection).

NULA is an anti-aliased residual convolutional neural network for CIFAR-10 image classification, trained to be robust against perturbations that exploit downsampling operations.

Classical image models rely on fragile high-frequency cues, which are exactly the components that downsampling operators destroy or alias.

NULA is trained to reduce this dependence and instead form representations that remain stable under information-destroying transformations such as resizing, decimation, and aliasing-style perturbations.

NULA explicitly targets robustness to operators that change the sampling structure of the input, rather than generic adversarial perturbations.

Problem

Downsampling operations are linear maps that perform a many-to-one mapping. Information is destroyed, and the null space is large:

any such map D : ℝⁿ → ℝᵐ with m < n has a non-trivial null space.

  • dim(ker(D)) = n - rank(D) ≥ n - m > 0

This is exploitable.
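As a small numerical check of the dimension count above, the sketch below builds a 1-D 2x average-pooling matrix and computes its rank and nullity with NumPy. The matrix and the sizes n = 8, m = 4 are illustrative assumptions, not operators taken from the NULA code.

```python
import numpy as np

n, m = 8, 4  # illustrative sizes: 8 input samples, 4 output samples

# Row i of D averages input samples 2i and 2i+1 (2x average pooling).
D = np.zeros((m, n))
for i in range(m):
    D[i, 2 * i] = 0.5
    D[i, 2 * i + 1] = 0.5

rank = np.linalg.matrix_rank(D)
nullity = n - rank  # rank-nullity theorem: dim(ker(D)) = n - rank(D)

# One explicit kernel vector: +1 / -1 inside the first pooling window.
v = np.zeros(n)
v[0], v[1] = 1.0, -1.0

print(rank, nullity)            # 4 4
print(np.allclose(D @ v, 0.0))  # True
```

Here the bound n - m is attained exactly: half of the input dimensions are invisible to D.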

Bilinear Resize

resize x.25

Bilinear resize computes a weighted average over local neighborhoods. Any input signal decomposes as:

  • x = x_low + x_high

where x_low contains the low-frequency components preserved by the operator, and x_high contains the high-frequency residual.

Under downsampling D and upsampling U:

  • D(x) ≈ x_low

  • U(D(x)) ≈ x_low

  • x_high ∈ ker(D)

So x_high is (approximately) annihilated. An attacker can inject arbitrary signal into x_high, perturbing the image while leaving the downsampled representation essentially unchanged.
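The decomposition can be made concrete with a simplified operator pair. The sketch below assumes D is 2x2 average pooling and U is nearest-neighbor upsampling, used here as stand-ins for bilinear down/up-resizing. With this choice D(x_high) is exactly zero, whereas for a true bilinear resize the relations above hold only approximately.

```python
import numpy as np

def downsample(x):
    """D: 2x2 average pooling on an (H, W) array with even H and W."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(z):
    """U: nearest-neighbor upsampling by a factor of 2."""
    return z.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
x = rng.random((32, 32))

x_low = upsample(downsample(x))   # part of x that survives D
x_high = x - x_low                # high-frequency residual

# D annihilates the residual, so D(x) equals D(x_low).
print(np.allclose(downsample(x_high), 0.0))                # True
print(np.allclose(downsample(x), downsample(x_low)))       # True

# Extra signal injected along x_high is invisible after downsampling.
x_attacked = x + 3.0 * x_high
print(np.allclose(downsample(x_attacked), downsample(x)))  # True
```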

Decimation

resize x.25

Decimation keeps every k-th pixel and discards the rest:

  • x_dec[i, j] = x[ki, kj]

No smoothing is applied before subsampling.

By the Nyquist-Shannon sampling theorem, frequency content above 1/(2k) of the original sampling rate is aliased: it folds back into lower frequencies and corrupts the representation.
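A 1-D example makes the folding visible. In the sketch below (frequencies chosen purely for illustration), a sinusoid at 3/8 cycles per sample lies above the 1/(2k) = 1/4 limit for k = 2. After decimation it becomes indistinguishable from a sinusoid at 1/8 cycles per sample.

```python
import numpy as np

k = 2                       # decimation factor
t = np.arange(64)           # sample indices at the original rate

f_low, f_high = 1 / 8, 3 / 8            # cycles per sample
low = np.cos(2 * np.pi * f_low * t)     # below the 1/(2k) = 1/4 limit
high = np.cos(2 * np.pi * f_high * t)   # above the limit

# The two signals differ at full resolution...
print(np.allclose(low, high))             # False
# ...but coincide once only every k-th sample is kept.
print(np.allclose(low[::k], high[::k]))   # True
```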

Checkerboard Attack

Figure: NULA adversarial analysis. Visualization of the Clean Input (x), the Perturbed Input (x + δ) at ε=0.05, and the Amplified Perturbation Signal (δ).

  • δ[i, j] = ε · (-1)^(i+j)

This is a Nyquist injection: it is the highest spatial frequency representable on a discrete grid. Under stride-2 subsampling S₂:

  • (S₂ δ)[i, j] = δ[2i, 2j] = ε · (-1)^(2i+2j) = ε · 1 = ε

The subsampled result is a constant.

The checkerboard pattern collapses completely to a DC offset and loses all adversarial structure.

This shows:

  • S₂ δ = ε · 𝟏, so δ ∈ ker(C S₂), where 𝟏 is the all-ones vector and C = I − (1/m) · 𝟏𝟏ᵀ subtracts the mean of the m subsampled pixels

The perturbation lies in the null space of the centered stride-2 operator. A network that subsamples with stride 2 before any filtering sees only a uniform offset of ε, never the checkerboard itself.

The result is an image perceptually identical to the original, with a manipulated activation pattern upstream of the first downsampling operation.
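The collapse to a constant is easy to verify numerically. The sketch below builds the checkerboard δ on a 32x32 grid with ε = 0.05 (the value used in the figure) and subsamples it with stride 2 at each of the four possible phase offsets. The variable names are illustrative.

```python
import numpy as np

eps = 0.05
i, j = np.indices((32, 32))
delta = eps * (-1.0) ** (i + j)   # checkerboard with amplitude eps

# The perturbation budget is met: the L-infinity norm equals eps.
linf = np.abs(delta).max()
print(linf)

# Stride-2 subsampling at every phase offset yields a constant image.
for a in (0, 1):
    for b in (0, 1):
        sub = delta[a::2, b::2]
        print((a, b), sub.min(), sub.max())
```

Phases (0, 0) and (1, 1) give +ε everywhere, while phases (0, 1) and (1, 0) give −ε. In every case the spatial structure of the checkerboard is gone.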

Let f = g ∘ D, where g is the remainder of the network.

  • for any δ ∈ ker(D):

    • f(x + δ) = g(D(x + δ)) = g(D(x)) = f(x)

The network is blind to the entire subspace ker(D).
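The invariance f(x + δ) = f(x) can be checked with a toy model. In the sketch below, D is a 1-D 2x average-pooling matrix and g is a random tanh layer. Both are hypothetical stand-ins, not parts of NULA. A basis of ker(D) is read off from the SVD of D.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 8

# D: 1-D 2x average pooling, written as an m x n matrix.
D = np.kron(np.eye(m), np.array([[0.5, 0.5]]))

# g: an arbitrary nonlinear map standing in for the rest of the network.
W = rng.standard_normal((10, m))

def g(z):
    return np.tanh(W @ z)

def f(x):
    return g(D @ x)

# Rows of Vt beyond rank(D) form an orthonormal basis of ker(D).
_, s, Vt = np.linalg.svd(D)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]

x = rng.standard_normal(n)
delta = null_basis.T @ rng.standard_normal(n - rank)  # random element of ker(D)

print(np.linalg.norm(delta) > 0)         # True: the perturbation is non-zero
print(np.allclose(f(x + delta), f(x)))   # True: the output does not change
```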

Approach

Anti-aliased Downsampling (BlurPool)

Standard strided convolutions perform subsampling without enforcing a band-limit, causing aliasing.

BlurPool2d introduces a low-pass filter before subsampling:

  • x → (low-pass filter) → subsample

The filter is a normalized fixed binomial kernel [1, 2, 1] ⊗ [1, 2, 1], applied depthwise: one filter per channel with no cross-channel mixing.

This enforces approximate band-limitedness prior to resolution reduction. It reduces aliasing artifacts and makes feature extraction more stable under downsampling.
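To illustrate the blur-then-subsample idea, here is a minimal NumPy sketch. It is not the model's BlurPool2d module. The function name blur_pool2d and the reflect padding are assumptions made for this example. It applies the normalized [1, 2, 1] ⊗ [1, 2, 1] kernel to each channel independently and then keeps every second pixel.

```python
import numpy as np

def blur_pool2d(x, stride=2):
    """Blur each channel of x (C, H, W) with [1,2,1] x [1,2,1] / 16, then subsample."""
    k1 = np.array([1.0, 2.0, 1.0])
    kernel = np.outer(k1, k1)
    kernel /= kernel.sum()                    # normalize so the weights sum to 1
    c, h, w = x.shape
    # Padding mode is an assumption; the source does not state it.
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="reflect")
    blurred = np.zeros((c, h, w))
    for a in range(3):
        for b in range(3):
            # Same 3x3 filter for every channel, no cross-channel mixing.
            blurred += kernel[a, b] * padded[:, a:a + h, b:b + w]
    return blurred[:, ::stride, ::stride]

eps = 0.05
i, j = np.indices((32, 32))
checker = (eps * (-1.0) ** (i + j))[None]     # (1, 32, 32) checkerboard

plain = checker[:, ::2, ::2]                  # stride-2 with no filter
pooled = blur_pool2d(checker)                 # blur, then stride-2

print(np.allclose(plain, eps))     # True: aliased to a constant offset
print(np.allclose(pooled, 0.0))    # True: the Nyquist component is removed
```

The binomial filter has a zero at the Nyquist frequency, so the checkerboard is removed before subsampling instead of being folded into a DC offset.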

Squeeze-and-Excitation (SE) blocks

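SE blocks perform channel-wise reweighting:

  • s = σ(W₂ · ReLU(W₁ · GAP(x)))

  • x → s ⊙ x

Here GAP is global average pooling, σ is the sigmoid, and W₁, W₂ are the two bottleneck projections.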

The bottleneck dimension is max(C // r, 1) where r = 16, keeping the recalibration lightweight relative to the feature dimension.

The network learns to suppress channels that carry unstable high-frequency information and amplify channels that carry structurally stable features.
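The gating computation can be sketched in a few lines of NumPy. This is a minimal forward pass under simplifying assumptions: random weights, no bias terms, and a hypothetical function name se_block. It is not the trained module.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, W1, W2):
    """x: (C, H, W); W1: (C_mid, C); W2: (C, C_mid). Returns (gated x, gates)."""
    squeezed = x.mean(axis=(1, 2))             # global average pool -> (C,)
    hidden = np.maximum(W1 @ squeezed, 0.0)    # ReLU bottleneck -> (C_mid,)
    s = sigmoid(W2 @ hidden)                   # per-channel gates in (0, 1)
    return s[:, None, None] * x, s             # broadcast gates over H and W

C, r = 128, 16
C_mid = max(C // r, 1)                         # bottleneck width: 8 for C = 128

rng = np.random.default_rng(0)
W1 = rng.standard_normal((C_mid, C)) * 0.1     # random weights, illustration only
W2 = rng.standard_normal((C, C_mid)) * 0.1
x = rng.standard_normal((C, 32, 32))

y, s = se_block(x, W1, W2)
print(y.shape, C_mid)                          # (128, 32, 32) 8
```

The max(C // r, 1) guard matters only for narrow layers: with C = 8 and r = 16 the integer division gives 0, and the bottleneck falls back to a single unit.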

FIRST EVALUATION (Base)

The first model (Base) was trained for clean classification only, without adversarial augmentation.

| Perturbation | Accuracy | Δ from clean accuracy |
| --- | --- | --- |
| No Perturbation | 91.95% | |
| Resize x0.5 (bilinear) | 59.83% | −32.12% |
| Resize x0.25 (bilinear) | 24.82% | −67.13% |
| Decimate x2 | 30.03% | −61.92% |
| Checkerboard ε = 0.03 | 75.47% | −16.48% |
| Checkerboard ε = 0.05 | 44.99% | −46.96% |

This is a catastrophic collapse of model performance under representation instability. The representation inside the network is highly sensitive to aliasing and not stable under non-invertible transformations.

  • for checkerboard perturbations: ∃δ s.t. ||δ||∞ ≤ ε, but argmax f(x + δ) ≠ argmax f(x)

  • for resize/decimate: the transformation preserves class identity while destroying high-frequency structure.

The model fails because its representations are not invariant to the loss of this structure.

SECOND EVALUATION (Robust)

The second model (Robust) was retrained from scratch under a modified training distribution.

During training, images were stochastically exposed to resolution-degrading transformations such as:

  • resize-down/up
  • hard decimation
  • anti-aliased blur-decimation

The training objective becomes:

  • min_θ E_{ (x, y) ~ D, T ~ 𝓣 } [ L(f_θ(T(x)), y) ]

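where 𝓣 is a distribution over resolution-degrading operators, T is an operator sampled from it, and D here denotes the data distribution (not the downsampling operator).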

The model is forced to learn representations that remain predictive under transformations that destroy or corrupt high-frequency information.
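One way to realize the sampling T ~ 𝓣 is to draw a random operator per image before the forward pass. The sketch below is a NumPy illustration under several assumptions that the source does not specify: uniform sampling, a fixed factor of 2, average pooling as a stand-in for bilinear resizing, and nearest-neighbor upsampling back to 32x32 after every operator. All function names are hypothetical.

```python
import numpy as np

def upsample_nearest(x, k):
    """Repeat each pixel k times along height and width."""
    return x.repeat(k, axis=-2).repeat(k, axis=-1)

def identity(x):
    return x

def resize_down_up(x, k=2):
    """Average-pool by k, then upsample back (stand-in for bilinear resize)."""
    c, h, w = x.shape
    pooled = x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))
    return upsample_nearest(pooled, k)

def hard_decimate(x, k=2):
    """Keep every k-th pixel with no smoothing, then upsample back."""
    return upsample_nearest(x[:, ::k, ::k], k)

def blur_decimate(x, k=2):
    """Binomial [1, 2, 1] blur, keep every k-th pixel, then upsample back."""
    k1 = np.array([1.0, 2.0, 1.0]) / 4.0
    h, w = x.shape[-2:]
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="reflect")
    blurred = sum(
        k1[a] * k1[b] * padded[:, a:a + h, b:b + w]
        for a in range(3) for b in range(3)
    )
    return upsample_nearest(blurred[:, ::k, ::k], k)

TRANSFORMS = [identity, resize_down_up, hard_decimate, blur_decimate]

def sample_transform(rng):
    """Draw T uniformly from the assumed operator family."""
    return TRANSFORMS[rng.integers(len(TRANSFORMS))]

rng = np.random.default_rng(0)
x = rng.random((3, 32, 32))    # one CIFAR-sized image, (C, H, W)
T = sample_transform(rng)
x_aug = T(x)                   # T(x) is what the loss L(f_θ(T(x)), y) would see
print(T.__name__, x_aug.shape)
```

Including the identity in the family keeps some clean images in every batch. The actual mixing probabilities used for NULA are not stated in this card.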

| Perturbation | Accuracy | Δ from base |
| --- | --- | --- |
| No Perturbation | 89.42% | −2.53% |
| Resize x0.5 | 85.37% | +25.54% |
| Resize x0.25 | 71.80% | +46.98% |
| Decimate x2 | 85.02% | +54.99% |
| Checkerboard ε = 0.03 | 89.43% | +13.96% |
| Checkerboard ε = 0.05 | 89.39% | +44.40% |
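The Δ columns of the two tables use different references, which the short script below makes explicit by recomputing them from the reported accuracies. In the first table Δ is measured against the Base model's clean accuracy. In the second table Δ is measured against the Base model under the same perturbation. The dictionary keys are shorthand labels introduced for this example.

```python
# Reported accuracies in percent, copied from the two tables.
base = {
    "No Perturbation": 91.95,
    "Resize x0.5": 59.83,
    "Resize x0.25": 24.82,
    "Decimate x2": 30.03,
    "Checkerboard eps=0.03": 75.47,
    "Checkerboard eps=0.05": 44.99,
}
robust = {
    "No Perturbation": 89.42,
    "Resize x0.5": 85.37,
    "Resize x0.25": 71.80,
    "Decimate x2": 85.02,
    "Checkerboard eps=0.03": 89.43,
    "Checkerboard eps=0.05": 89.39,
}

clean = base["No Perturbation"]
for name in base:
    drop = base[name] - clean          # first table: Base vs. its clean accuracy
    gain = robust[name] - base[name]   # second table: Robust vs. Base, same input
    print(f"{name:24s} drop {drop:+7.2f}   gain {gain:+7.2f}")
```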

Interpretation

The baseline model relies on high-frequency components that are not stable under downsampling or aliasing. These components lie in regions of the input space that are not preserved by common sampling operators.

As a result, small perturbations aligned with these unstable directions cause large changes in the model’s activations.

The robust variant shifts reliance toward features that lie in the row space of the downsampling operator (the orthogonal complement of its null space), which is the subspace that survives projection.

These features encode structural information at scales that are preserved under frequency loss, rather than fine-grained detail that is annihilated by the null space.

The result is a model whose decision boundary is anchored to geometry that persists through information destruction.

Usage

NULA is hosted on the HuggingFace Hub and can be loaded directly via the transformers library.

from transformers import pipeline

classifier = pipeline(
    "image-classification", 
    model="MamaPearl/nula-cifar10-robust-v0", 
    trust_remote_code=True
)

# Run on any image URL or local path
results = classifier("https://path-to-your-image.jpg")

for res in results:
    print(f"{res['label']}: {res['score']:.2%}")

For manual loading, see infer.py.

Input tensors should have shape (B, C, H, W).

Preprocessing Requirements:

  • Color Mode: RGB (images should be converted via .convert("RGB"))

  • Input Size: 32x32 (standard for CIFAR-10)

  • Normalization: Mean [0.5, 0.5, 0.5] and std [0.5, 0.5, 0.5] (scales pixels from [0, 1] to [-1, 1])
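For manual loading, the requirements above amount to a few array operations. The sketch below is a NumPy-only illustration and is not taken from infer.py. The function name preprocess is hypothetical, and the input is assumed to be a 32x32 RGB image already decoded into a uint8 array of shape (H, W, 3).

```python
import numpy as np

MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)
STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)

def preprocess(image_hwc_uint8):
    """(32, 32, 3) uint8 RGB image -> (1, 3, 32, 32) float32 tensor in [-1, 1]."""
    if image_hwc_uint8.shape != (32, 32, 3):
        raise ValueError("expected a 32x32 RGB image with shape (32, 32, 3)")
    x = image_hwc_uint8.astype(np.float32) / 255.0   # scale to [0, 1]
    x = (x - MEAN) / STD                             # normalize to [-1, 1]
    x = np.transpose(x, (2, 0, 1))                   # HWC -> CHW
    return x[None]                                   # add batch axis: (B, C, H, W)

black = np.zeros((32, 32, 3), dtype=np.uint8)
white = np.full((32, 32, 3), 255, dtype=np.uint8)
print(preprocess(black).min(), preprocess(white).max())   # -1.0 1.0
```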

NULA Architecture

| Component | Details |
| --- | --- |
| Stem | 3 → 128, Conv3×3, BatchNorm, SiLU |
| Stage 1 | 128 → 128, residual, no downsample |
| Stage 2 | 128 → 256, residual, BlurPool downsample |
| Stage 3 | 256 → 512, residual, BlurPool downsample |
| Head | GlobalAvgPool → Linear(512, 512) → SiLU → Dropout(0.3) → Linear(512, 10) |

SE blocks applied at each stage with reduction factor 16.
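As a quick sanity check of the table, the snippet below traces the feature-map shape through the stages and counts the parameters of the head. It assumes that each BlurPool downsample halves the spatial resolution (stride 2) and that both Linear layers carry a bias. Neither detail is stated explicitly in this card.

```python
# (name, in_channels, out_channels, downsample) for each stage in the table.
stages = [
    ("Stem", 3, 128, False),
    ("Stage 1", 128, 128, False),
    ("Stage 2", 128, 256, True),
    ("Stage 3", 256, 512, True),
]

size = 32  # CIFAR-10 input resolution
for name, c_in, c_out, downsample in stages:
    if downsample:
        size //= 2  # assumed stride-2 BlurPool
    print(f"{name:8s} {c_in:3d} -> {c_out:3d}   output: {c_out} x {size} x {size}")

# Head: Linear(512, 512) and Linear(512, 10), weights plus (assumed) biases.
head_params = (512 * 512 + 512) + (512 * 10 + 10)
print("head parameters:", head_params)   # 267786
```

Under these assumptions the final feature map is 512 x 8 x 8 before global average pooling.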

Citation

If you use this model or repository in your research, please cite:

@misc{mamapearl_nula_2026,
  author       = {MamaPearl},
  title        = {NULA: Robust CIFAR-10 Classification via Anti-Aliased Downsampling and Adversarial Augmentation},
  month        = apr,
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MamaPearl/nula-cifar10-robust-v0}},
}

Authors

MamaPearl · @mamapearli

License

This project is licensed under the MIT License. See LICENSE for more information.

Model size: 5.22M params · Tensor type: F32 · Format: Safetensors