---
license: apache-2.0
tags:
- audio
- speech
- foundation-model
- next-token-prediction
- isoflop
- research
---

# Discrete Audio IsoFLOP Model (discrete-audio-isoflop-1.8e19-173M-d512-L6-B128-b4e255)

This model belongs to a suite of discrete audio models trained for our IsoFLOP study as part of **SODA**, which performs unified next-token prediction on interleaved semantic, acoustic, and text tokens.

🥤 **Project Page:** [https://soda-audio.github.io](https://soda-audio.github.io/)

For full usage instructions (e.g., inference code) and more information, please refer to the **[SODA-4B-base](https://huggingface.co/soda-research/soda-4b-base)** model card.

The details for this particular model are as follows:
- `compute_budget`: 1.8e19
- `param_count` (non-embedding): 173M
- `hidden_dim`: 512
- `num_layers`: 6
- `batch_size`: 128
- `training_step`: 45946
- `hash_key`: b4e255
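
Most of the fields above also appear in the model name (`training_step` is the exception). The following is a minimal sketch of how the name could be split back into those fields; the helper `parse_isoflop_name` and its regular expression are hypothetical, inferred only from this model's name, and are not part of any SODA tooling.

```python
import re

# Hypothetical pattern, inferred from this card's model name only:
# discrete-audio-isoflop-<compute_budget>-<param_count>-d<hidden_dim>-L<num_layers>-B<batch_size>-<hash_key>
NAME_PATTERN = re.compile(
    r"^discrete-audio-isoflop-"
    r"(?P<compute_budget>[0-9.]+e[0-9]+)-"
    r"(?P<param_count>[0-9.]+[MB])-"
    r"d(?P<hidden_dim>\d+)-"
    r"L(?P<num_layers>\d+)-"
    r"B(?P<batch_size>\d+)-"
    r"(?P<hash_key>[0-9a-f]+)$"
)


def parse_isoflop_name(name: str) -> dict:
    """Split an IsoFLOP model name into the fields listed on this card."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized model name: {name!r}")
    fields = match.groupdict()

    # Assumption: the suffix M means millions and B means billions of parameters.
    suffix_scale = {"M": 1e6, "B": 1e9}[fields["param_count"][-1]]

    return {
        "compute_budget": float(fields["compute_budget"]),
        "param_count": int(float(fields["param_count"][:-1]) * suffix_scale),
        "hidden_dim": int(fields["hidden_dim"]),
        "num_layers": int(fields["num_layers"]),
        "batch_size": int(fields["batch_size"]),
        "hash_key": fields["hash_key"],
    }


config = parse_isoflop_name(
    "discrete-audio-isoflop-1.8e19-173M-d512-L6-B128-b4e255"
)
print(config)
```

Other models in the suite may follow a different naming scheme, so the helper raises a `ValueError` on any name it does not recognize rather than guessing.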

📈 **WandB**: https://wandb.ai/potsawee/marin/groups/IsoFlop/workspace