E2/F5 TTS
This is an online demo for F5-TTS with advanced batch processing support. This app supports the following TTS models:
- F5-TTS (A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching)
- E2 TTS (Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS)
The checkpoints currently support English and Chinese.
If you're having issues, try converting your reference audio to WAV or MP3 and clipping it to 12s with the ✂ button in the bottom right corner; otherwise the automatic trimming may give a suboptimal result.
NOTE: Reference text will be automatically transcribed with Whisper if not provided. For best results, keep your reference clips short (<12s). Ensure the audio is fully uploaded before generating.
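For anyone preparing reference clips offline instead of using the ✂ button, here is a minimal sketch of trimming a WAV file to 12 seconds with Python's standard `wave` module. The function name `clip_wav` and its parameters are hypothetical, not part of the demo, and the sketch only handles uncompressed PCM WAV input.

```python
import wave


def clip_wav(src_path, dst_path, max_seconds=12.0):
    """Copy at most max_seconds of audio from src_path to dst_path.

    Returns the duration (in seconds) of the audio that was kept.
    Hypothetical helper for illustration; not part of the demo itself.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Keep the whole clip if it is already shorter than the limit.
        keep = min(src.getnframes(), int(max_seconds * rate))
        frames = src.readframes(keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channels, sample width and rate; the frame count in the
        # header is corrected automatically when the data is written.
        dst.setparams(params)
        dst.writeframes(frames)

    return keep / rate
```

Calling `clip_wav("reference.wav", "reference_12s.wav")` would write a copy that is at most 12 seconds long; the file names here are placeholders.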
Batched TTS
Check to use a random seed for each generation. Uncheck to use the seed specified.
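The checkbox behaviour described above can be sketched as follows. This is an illustration only: `resolve_seed` and `MAX_SEED` are hypothetical names, and the upper bound on the seed is an assumption rather than something the demo states.

```python
import random

# Assumed upper bound for a seed value; the demo does not state its own limit.
MAX_SEED = 2**31 - 1


def resolve_seed(randomize, seed):
    """Return a fresh random seed if randomize is set, else the given seed."""
    if randomize:
        return random.randint(0, MAX_SEED)
    return int(seed)
```

Reusing the same seed with the same inputs is what makes a generation repeatable, which is why unchecking the box is useful when comparing settings.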
If the output contains undesirably long silences, turn this on to detect and crop them automatically.
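To illustrate the general idea of cropping long silences, here is a minimal amplitude-threshold sketch. It is not the demo's actual implementation, which this page does not describe; the function name, the threshold and the maximum silence length are all assumptions chosen for the example.

```python
def crop_long_silences(samples, rate, threshold=0.01, max_silence_s=0.5):
    """Shorten every quiet run longer than max_silence_s to max_silence_s.

    samples: sequence of floats in [-1.0, 1.0]
    rate: sample rate in Hz
    Hypothetical helper for illustration; thresholds are assumed values.
    """
    max_run = int(max_silence_s * rate)
    out = []
    run = 0  # length of the current run of quiet samples
    for s in samples:
        if abs(s) < threshold:
            run += 1
            if run > max_run:
                continue  # drop quiet samples beyond the allowed length
        else:
            run = 0
        out.append(s)
    return out
```

Short pauses are left untouched, so natural gaps between phrases survive while long stretches of silence are trimmed.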