# Plan for Implementing High-Performance Batch Processing

This document outlines the necessary code modifications to implement a high-performance batch processing mode that can be toggled by the "Use VADER" checkbox in the GUI. The goal is to create two distinct modes:

- **VADER On (Nuanced Mode):** Slower, processes chunks one-by-one with unique TTS parameters for nuanced delivery.
- **VADER Off (Batch Mode):** Significantly faster, processes chunks in batches with a single set of TTS parameters.

---

## 1. File to Modify: `src/chatterbox/tts.py`

* **Purpose:** To enable the core TTS model to handle batches of text.
* **Changes Needed:**
    * A new method, `generate_batch(self, texts: list, **tts_params)`, needs to be created within the `ChatterboxTTS` class.
    * This method must perform the following steps:
        1. Accept a list of text strings (`texts`).
        2. Tokenize each text string in the list.
        3. Pad the tokenized sequences to ensure they all have the same length, creating a single batch tensor. `torch.nn.utils.rnn.pad_sequence` is suitable for this.
        4. Feed the complete batch tensor to the underlying model (`self.t3.inference` and `self.s3gen.inference`).
        5. Return a list of the resulting audio waveforms.

---

## 2. File to Modify: `modules/tts_engine.py`

* **Purpose:** To orchestrate the new batching workflow and choose the processing mode.
* **Changes Needed:**

### a. Create a New Worker Function

* Add a new function: `process_batch(batch_of_chunks, model, ...)`
* This function will:
    1. Accept a list of chunk objects (e.g., a batch of 16).
    2. Extract the text from each chunk into a simple list.
    3. Call the new `model.generate_batch()` with the list of texts and the shared TTS parameters.
    4. Receive a list of audio waveforms back.
    5. Loop through the audio waves, apply the existing silence trimming and padding logic to each one, and save them to their respective `chunk_...wav` files.

### b. Modify the Main `process_book_folder` Function

* Locate the `use_vader` flag which is determined from the GUI options.
* Wrap the core processing loop in an `if/else` block based on this flag.
    * **`if use_vader:` (Nuanced Mode):**
        * Keep the existing code that iterates through chunks one-by-one and submits them to the `process_one_chunk` function.
    * **`else:` (Batch Mode):**
        * Add the new logic here.
        * Group the `all_chunks` list into fixed-size batches based on `TTS_BATCH_SIZE` from the config.
        * Use the existing `ThreadPoolExecutor` to submit these new **batches** to the new `process_batch` worker function.

---

## 3. Files to Modify: `config/config.py` and `chatterbox_gui.py`

* **Purpose:** To provide user control over the batch size for performance tuning.
* **Changes Needed:**
    * **In `config/config.py`:**
        * Add a new configuration variable: `TTS_BATCH_SIZE = 16` (or another sensible default).
    * **In `chatterbox_gui.py`:**
        * On the "Config" tab, add a new `QSpinBox` (numeric input field) that is linked to the `TTS_BATCH_SIZE` variable. This will allow the user to change the batch size without editing code.
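
The mode switch and batch grouping described in section 2b, driven by the `TTS_BATCH_SIZE` value from section 3, could look roughly like the sketch below. It is a minimal illustration and not the project's actual code. The names `make_batches` and `dispatch_chunks` and the `max_workers` argument are hypothetical. The two worker functions are stand-ins that return strings instead of generating audio, and the real `process_one_chunk` and `process_batch` take additional arguments (model, TTS parameters, output paths) that are omitted here.

```python
from concurrent.futures import ThreadPoolExecutor

# Mirrors the default proposed for config/config.py.
TTS_BATCH_SIZE = 16


def make_batches(chunks, batch_size):
    """Split a list into consecutive batches; the last one may be shorter."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [chunks[i:i + batch_size] for i in range(0, len(chunks), batch_size)]


def process_one_chunk(chunk):
    # Stand-in for the existing per-chunk worker (assumed signature).
    return f"audio:{chunk}"


def process_batch(batch_of_chunks):
    # Stand-in for the proposed batch worker. The real one would call
    # model.generate_batch(texts, **tts_params) and save each waveform.
    return [f"audio:{chunk}" for chunk in batch_of_chunks]


def dispatch_chunks(all_chunks, use_vader, batch_size=TTS_BATCH_SIZE, max_workers=2):
    """Hypothetical helper: run all chunks in the mode selected by use_vader."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        if use_vader:
            # Nuanced mode: one task per chunk.
            return list(executor.map(process_one_chunk, all_chunks))

        # Batch mode: one task per batch, flattened back into chunk order.
        results = []
        batches = make_batches(all_chunks, batch_size)
        for batch_result in executor.map(process_batch, batches):
            results.extend(batch_result)
        return results


if __name__ == "__main__":
    chunks = [f"chunk_{i:03d}" for i in range(40)]
    print([len(b) for b in make_batches(chunks, TTS_BATCH_SIZE)])  # [16, 16, 8]
    print(len(dispatch_chunks(chunks, use_vader=False)))  # 40
```

`executor.map` yields results in submission order, so the flattened output lines up with `all_chunks` even when batches finish out of order. Since each batch task holds more data than a single-chunk task, the worker count that suits Nuanced Mode may be too high for Batch Mode, so it is worth tuning together with `TTS_BATCH_SIZE`.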