---
license: apache-2.0
tags:
- audio
- speech
- foundation-model
- next-token-prediction
- isoflop
- research
---

# Discrete Audio IsoFLOP Model (discrete-audio-isoflop-1.8e19-173M-d512-L6-B128-b4e255)

This model belongs to a suite of discrete audio models trained for our IsoFLOP study as part of **SODA**, which applies unified next-token prediction to interleaved semantic, acoustic, and text tokens.

🥤 **Project Page:** [https://soda-audio.github.io](https://soda-audio.github.io/)

For full usage instructions (e.g., inference code) and more information, please refer to the **[SODA-4B-base](https://huggingface.co/soda-research/soda-4b-base)** model card.

The details for this particular model are as follows:
- `compute_budget`: 1.8e19
- `param_count` (non-embedding): 173M
- `hidden_dim`: 512
- `num_layers`: 6
- `batch_size`: 128
- `training_step`: 45946
- `hash_key`: b4e255

📈 **WandB**: https://wandb.ai/potsawee/marin/groups/IsoFlop/workspace
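
The model name appears to encode most of the fields listed above (compute budget, non-embedding parameter count, hidden dimension, layer count, batch size, and hash key). The sketch below shows one way to recover them from the name. The naming pattern is inferred from this single model card, and `IsoFlopConfig` and `parse_isoflop_name` are hypothetical helpers for illustration, not part of the SODA codebase. Note that `training_step` is not encoded in the name.

```python
import re
from dataclasses import dataclass


@dataclass
class IsoFlopConfig:
    """Fields recovered from a model name (hypothetical container)."""
    compute_budget: float  # training compute budget, e.g. 1.8e19
    param_count: int       # non-embedding parameters
    hidden_dim: int
    num_layers: int
    batch_size: int
    hash_key: str


# Assumed pattern, inferred from this card's model name only:
# discrete-audio-isoflop-<compute>-<params>M-d<hidden>-L<layers>-B<batch>-<hash>
_NAME_RE = re.compile(
    r"^discrete-audio-isoflop-"
    r"(?P<compute>[0-9.]+e[0-9]+)-"
    r"(?P<params>[0-9]+)M-"
    r"d(?P<hidden>[0-9]+)-"
    r"L(?P<layers>[0-9]+)-"
    r"B(?P<batch>[0-9]+)-"
    r"(?P<hash>[0-9a-f]+)$"
)


def parse_isoflop_name(name: str) -> IsoFlopConfig:
    """Parse a model name into its configuration fields."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"Name does not follow the assumed pattern: {name!r}")
    return IsoFlopConfig(
        compute_budget=float(match["compute"]),
        param_count=int(match["params"]) * 1_000_000,  # "173M" -> 173,000,000
        hidden_dim=int(match["hidden"]),
        num_layers=int(match["layers"]),
        batch_size=int(match["batch"]),
        hash_key=match["hash"],
    )


config = parse_isoflop_name(
    "discrete-audio-isoflop-1.8e19-173M-d512-L6-B128-b4e255"
)
print(config)
```

Other models in the suite may use name variants that this pattern does not cover (for example, a parameter count not expressed in millions), in which case the function raises `ValueError` rather than guessing.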