4,384 results for “topic:tts”
Clone a voice in 5 seconds to generate arbitrary speech in real-time
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: Generate Text, MCP, Audio, Video, Images, Voice Cloning, Distributed, P2P and decentralized inference
A generative speech model for daily dialogue.
🚀Clone a voice in 5 seconds to generate arbitrary speech in real-time
Instant voice cloning by MIT and MyShell. Audio foundation model.
SOTA Open Source TTS
From the team behind Gatsby, Mastra is a framework for building AI-powered applications and agents with a modern TypeScript stack.
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
Readest is a modern, feature-rich ebook reader designed for avid readers offering seamless cross-platform access, powerful tools, and an intuitive interface to elevate your reading experience.
Generate audiobooks from e-books, voice cloning & 1158+ languages!
🌈一个跨平台的划词翻译和OCR软件 | A cross-platform software for text translation and recognition.
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
A fast, local neural text to speech system
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Video translation and dubbing tool powered by LLMs. The video translator offers 100 language translations and one-click full-process deployment. The video translation output is optimized for platforms like YouTube,TikTok. AI视频翻译配音工具,100种语言双向翻译,一键部署全流程,可以生抖音,小红书,哔哩哔哩,视频号,TikTok,Youtube等形态的内容成适配
阅读书源-香色闺阁+用心读书+源阅+阅读3.0书源+源阅读+爱阅书香+千阅+花火阅读+读不舍手+番茄+喜马拉雅+漫画+听书+书源+IPTV源+IPA巨魔应用=自动更新
A sound cloning tool with a web interface, using your voice or any sound to record audio / 一个带web界面的声音克隆工具,使用你的音色或任意声音来录制音频
vits2 backbone with multilingual-bert
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in https://plachtaa.github.io/vallex/
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
一个简单的本地网页界面,使用ChatTTS将文字合成为语音,同时支持对外提供API接口。A simple native web interface that uses ChatTTS to synthesize text into speech, along with support for external API interfaces.
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.