Project Description
IndexTTS is an industrial-level, controllable, and efficient zero-shot text-to-speech system. It features a GPT-style model based on XTTS and Tortoise, and it can correct Chinese pronunciation using pinyin and control pauses through punctuation. Enhanced with a speaker condition feature representation and BigVGAN2 for optimized audio quality, IndexTTS achieves state-of-the-art performance, surpassing popular TTS systems.
Project Information
Created on 2/6/2025
Updated on 7/2/2025
Categories
speech-technology
ai-content-generation
machine-learning-framework
Tags
ready-to-use
enterprise-application
model-deployment
data-processing
chinese-support
Topics
bigvgan
cross-lingual
indextts
text-to-speech
tts
voice-clone
zero-shot-tts