---
license: other
license_name: desert-ant-labs-source-available-1.0
license_link: LICENSE.md
language:
  - en
tags:
  - audio
  - speech-enhancement
  - denoising
  - dereverberation
  - on-device
  - core-ml
  - onnx
pipeline_tag: audio-to-audio
---

# Clear: on-device speech enhancement

48 kHz on-device speech enhancement. Takes noisy mono or stereo audio (phone mic, untreated room, traffic), returns a podcast-ready file: denoised, dereverbed, voice warm and present.

## Try it

- **Browser demo:** [`huggingface.co/spaces/desert-ant-labs/clear-demo`](https://huggingface.co/spaces/desert-ant-labs/clear-demo). Twelve real recordings, raw vs cleaned, sample-aligned A/B.
- **Drop-in SDKs** for iOS, Android, and Web are coming soon. Email <contact@desertant.ai> for early access.

## Variants

| Variant | Character | When to use |
|---|---|---|
| **`clear-studio`** | Quiet, studio-like; silences near zero | Default. Works across the full range of input quality: phone audio, laptop mic, untreated rooms, USB / XLR podcast captures. |
| **`clear-natural`** | Room tone, breath, lip texture preserved | Treated podcast studios, USB / XLR captures, voiceover where the original sound is intentional. |

If the source is already clean and you want the model to stay invisible, pick `clear-natural`. Otherwise, use `clear-studio`.

## Files

Both variants share the same architecture and realtime cost; only the weights differ.

| Variant | File | Format | Size |
|---|---|---|---:|
| `clear-studio` | `clear-studio.mlpackage.zip` | Core ML mlpackage (fp16) | ~3.8 MB |
| `clear-studio` | `clear-studio.mlmodelc.zip` | Core ML mlmodelc (fp16, precompiled) | ~3.8 MB |
| `clear-studio` | `clear-studio.onnx` | ONNX (fp32) | ~8.5 MB |
| `clear-natural` | `clear-natural.mlpackage.zip` | Core ML mlpackage (fp16) | ~3.8 MB |
| `clear-natural` | `clear-natural.mlmodelc.zip` | Core ML mlmodelc (fp16, precompiled) | ~3.8 MB |
| `clear-natural` | `clear-natural.onnx` | ONNX (fp32) | ~8.5 MB |
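
The file names in the table follow a `<variant><suffix>` pattern, so a small lookup can build them. The sketch below is illustrative only: `artifact_filename`, `VARIANTS`, and `SUFFIXES` are hypothetical names, not part of any shipped SDK.

```python
# Hypothetical helper: names and suffixes mirror the Files table above.
VARIANTS = ("clear-studio", "clear-natural")
SUFFIXES = {
    "mlpackage": ".mlpackage.zip",  # Core ML mlpackage (fp16)
    "mlmodelc": ".mlmodelc.zip",    # Core ML mlmodelc (fp16, precompiled)
    "onnx": ".onnx",                # ONNX (fp32)
}


def artifact_filename(variant: str, fmt: str) -> str:
    """Return the repository file name for a variant/format pair."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    if fmt not in SUFFIXES:
        raise ValueError(f"unknown format: {fmt!r}")
    return variant + SUFFIXES[fmt]


print(artifact_filename("clear-natural", "onnx"))  # clear-natural.onnx
```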

## Use

### ONNX

```python
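from huggingface_hub import hf_hub_download
import onnxruntime as ort

# Download the ONNX weights from the Hub (cached locally after the first call).
path    = hf_hub_download("desert-ant-labs/clear", "clear-studio.onnx")
# Load on the CPU execution provider. STFT, ERB, and ISTFT run outside this session.
session = ort.InferenceSession(path, providers=["CPUExecutionProvider"])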
```
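
Once a session is loaded, it can help to list the tensors the graph expects before wiring up any DSP. The sketch below uses ONNX Runtime's `get_inputs()` and `get_outputs()`. `describe_io` is a hypothetical helper name, and no shapes are assumed here; read them from the real model.

```python
def describe_io(session):
    """Map tensor names to shapes for a loaded onnxruntime.InferenceSession."""
    return {
        "inputs": {arg.name: arg.shape for arg in session.get_inputs()},
        "outputs": {arg.name: arg.shape for arg in session.get_outputs()},
    }
```

Calling `describe_io(session)` on the session created above returns a dictionary with one entry per input and output tensor.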

## Inputs and outputs

- **Architecture:** DeepFilterNet 3 (DFN3-half).
- **Sample rate:** 48 kHz, mono or stereo (per-channel inference).
- **Inference contract:** inputs `spec`, `feat_erb`, and `feat_spec`; output `spec_enhanced`. STFT, ERB feature extraction, and ISTFT are host-side DSP, not part of the model graph.
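
Because the transforms live on the host, the caller owns the STFT/ISTFT round trip and the per-channel loop. The NumPy sketch below shows one way to structure that, under stated assumptions: the 960-sample frame and 480-sample hop are upstream DeepFilterNet defaults and may not match this model's export, the sqrt-Hann window is a stand-in chosen because it reconstructs perfectly at 50% overlap, and `process_spec` is a hypothetical placeholder for ERB feature extraction plus the model call. Packing the complex spectrogram into the layout the `spec` tensor expects is not shown.

```python
import numpy as np

SR = 48_000   # sample rate from the model card
N_FFT = 960   # assumption: 20 ms frame (upstream DeepFilterNet default)
HOP = 480     # assumption: 10 ms hop, i.e. 50% overlap

# Periodic sqrt-Hann: the squared window overlap-adds to 1 at 50% overlap.
WINDOW = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * np.arange(N_FFT) / N_FFT))


def stft(x):
    """Mono signal -> complex spectrogram of shape (frames, N_FFT // 2 + 1)."""
    # Pad one hop in front and enough behind that two frames cover every sample.
    pad_tail = (-len(x)) % HOP + HOP
    x = np.concatenate([np.zeros(HOP), x, np.zeros(pad_tail)])
    n_frames = (len(x) - N_FFT) // HOP + 1
    frames = np.stack([x[i * HOP : i * HOP + N_FFT] for i in range(n_frames)])
    return np.fft.rfft(frames * WINDOW, axis=-1)


def istft(spec, length):
    """Inverse of stft(): overlap-add, then trim the padding."""
    frames = np.fft.irfft(spec, n=N_FFT, axis=-1) * WINDOW
    out = np.zeros((len(frames) - 1) * HOP + N_FFT)
    for i, frame in enumerate(frames):
        out[i * HOP : i * HOP + N_FFT] += frame
    return out[HOP : HOP + length]


def process_per_channel(audio, process_spec):
    """Run process_spec on each channel's spectrogram independently."""
    audio = np.atleast_2d(audio)  # (channels, samples); mono becomes (1, samples)
    return np.stack(
        [istft(process_spec(stft(ch)), audio.shape[1]) for ch in audio]
    )


rng = np.random.default_rng(0)
stereo = rng.standard_normal((2, SR))        # 1 s of placeholder stereo noise
restored = process_per_channel(stereo, lambda spec: spec)  # identity stand-in
print(stft(stereo[0]).shape)                 # (101, 481)
print(np.allclose(restored, stereo))         # True
```

With the identity stand-in the output matches the input, which is a quick check that the analysis and synthesis stages are consistent before a model is dropped in.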

## Performance

Both variants run at the same speed. Enhancing a 5-minute clip on the Apple Neural Engine:

| Device | Chip | Mono | Stereo |
|---|---|---:|---:|
| iPhone 15 Pro | A17 Pro | 4.88 s (61× realtime) | 6.53 s (46×) |
| iPhone 17 Pro | A19 Pro | 3.70 s (81× realtime) | 5.16 s (58×) |

Cold model load takes ~0.6 s; subsequent loads take ~100 ms thanks to the system ANE cache.
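
The realtime factors in the table are the clip duration divided by the processing time. A minimal check of that arithmetic, using only the timings listed above (`realtime_factor` is a hypothetical helper name):

```python
CLIP_SECONDS = 5 * 60  # the 5-minute benchmark clip


def realtime_factor(processing_seconds, clip_seconds=CLIP_SECONDS):
    """Seconds of audio enhanced per second of wall time."""
    return clip_seconds / processing_seconds


# Processing times in seconds, copied from the table above.
timings = {
    ("iPhone 15 Pro", "mono"): 4.88,
    ("iPhone 15 Pro", "stereo"): 6.53,
    ("iPhone 17 Pro", "mono"): 3.70,
    ("iPhone 17 Pro", "stereo"): 5.16,
}

for (device, layout), seconds in timings.items():
    print(f"{device} {layout}: {realtime_factor(seconds):.0f}x realtime")
```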

## Limitations

- Trained on English speech; non-English speech still benefits but has not been measured against per-language ground truth.
- Heavy background music or multi-speaker overlap degrades quality.
- Mastering is informational only; verify against the platform's actual loudness target before publishing.

## Built on

- [DeepFilterNet 3](https://github.com/Rikorose/DeepFilterNet) by Rikorose, MIT. Fine-tuned on the Desert Ant Labs speech corpus.

## License

Released under the **Desert Ant Labs Source-Available License v1.0** (see [`LICENSE.md`](LICENSE.md)).

- **Free for commercial use up to 100,000 Monthly Active Users (MAU).**
- Above 100,000 MAU a commercial license is required. Contact <licensing@desertant.ai>.

## Citation

```bibtex
@software{clear_2026,
  title  = {Clear: on-device speech enhancement},
  author = {Desert Ant Labs},
  year   = {2026},
  url    = {https://huggingface.co/desert-ant-labs/clear},
}
```

---

© 2026 Desert Ant Labs · <https://desertant.ai>