gradio>=3.40
transformers>=4.36.0
torch
torchaudio
soundfile
numpy
sentencepiece
huggingface-hub
evaluate
scipy