Coqui - Advanced Text-to-Speech Library
Coqui TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it is designed to achieve the best trade-off among ease of training, speed, and quality. Coqui TTS comes with pretrained models and tools for measuring dataset quality, and it is already used in 20+ languages for products and research projects.
Features:
- High-performance Deep Learning models for Text2Speech tasks.
- Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech).
- Speaker Encoder to compute speaker embeddings efficiently.
- Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN, WaveGrad, WaveRNN).
- Fast and efficient model training.
- Detailed training logs in the terminal and in TensorBoard.
- Support for Multi-speaker TTS.
- Efficient, flexible, lightweight but feature-complete Trainer API.
- Ability to convert PyTorch models to TensorFlow 2.0 and TFLite for inference.
- Released and ready-to-use models.
- Tools to curate Text2Speech datasets under dataset_analysis.
- Utilities to use and test your models.
- Modular (but not too much) code base enabling easy implementation of new ideas.
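To illustrate the speaker embeddings mentioned in the feature list, the sketch below compares fixed-size embedding vectors by cosine similarity, a common way to judge whether two utterances come from the same speaker. It does not call the Coqui TTS API: the `cosine_similarity` helper and the small vectors are hypothetical stand-ins for what a speaker encoder would produce.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical 4-dimensional embeddings, invented for illustration only.
# Real speaker encoders emit much larger vectors.
speaker_a_utt1 = [0.9, 0.1, 0.3, 0.2]
speaker_a_utt2 = [0.8, 0.2, 0.4, 0.1]
speaker_b_utt1 = [-0.1, 0.9, -0.3, 0.7]

same = cosine_similarity(speaker_a_utt1, speaker_a_utt2)
different = cosine_similarity(speaker_a_utt1, speaker_b_utt1)

print(f"same speaker:      {same:.3f}")
print(f"different speaker: {different:.3f}")
```

In this toy data the two utterances from the same speaker score higher than the pair from different speakers. In a multi-speaker setup, embeddings of this kind are what lets a single model be conditioned on a particular voice.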
https://coqui.ai/
https://github.com/coqui-ai/TTS/
License:
Tech:
Tags: