"topic:voice-cloning" — Search

Netflix-level subtitle cutting, translation, alignment, and even dubbing - a one-click, fully automated AI video subtitle team

Python · 16.1k · 1.7k · Updated 2 hours ago

ai-translation dubbing localization video-translation voice-cloning

PaddlePaddle/PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Python · 12.5k · 2.0k · Updated 2 hours ago

asr code-switch conformer kws punctuation-restoration self-supervised-learning sound-classification speech-alignment speech-recognition speech-synthesis speech-translation streaming-asr streaming-tts transformer tts vocoder voice-cloning voice-recognition wav2vec2 whisper

abus-aikorea/voice-pro

Gradio WebUI for creators and developers, featuring key TTS (Edge-TTS, kokoro) and zero-shot Voice Cloning (E2 & F5-TTS, CosyVoice), with Whisper audio processing, YouTube download, Demucs vocal isolation, and multilingual translation.

Python · 6.4k · 689 · Updated 3 hours ago

audiobook faster-whisper gradio karaoke podcasts speech-recognition speech-synthesis speech-to-text subtitles text-to-speech transcription translator tts voice-cloning voice-conversion webui whisper whisperx yt-dlp

OpenBMB/VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python · 6.1k · 735 · Updated just now

audio deeplearning minicpm python pytorch speech speech-synthesis text-to-speech tts tts-model voice-cloning

multimodal-art-projection/YuE

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python · 6.1k · 716 · Updated 4 hours ago

ai audio-generation deep-learning foundation-models gpt huggingface llama llms music-generation style-transfers voice-cloning

IAHispano/Applio

A simple, high-quality voice conversion tool focused on ease of use and performance.

Python · 3.0k · 501 · Updated 4 hours ago

ai applio pytorch rvc speech speech-to-speech text-to-speech tts vc vits voice voice-clone voice-cloning voice-conversion

Camb-ai/MARS5-TTS

MARS5 speech model (TTS) from CAMB.AI

Jupyter Notebook · 2.8k · 246 · Updated 3 days ago

prosody speech speech-synthesis text-to-speech voice-cloneai voice-cloning

voice-cloning-app/Voice-Cloning-App

A Python/Pytorch app for easily synthesising human voices

Python · 1.4k · 238 · Updated 1 week ago

deep-learning python pytorch tacotron2 text-to-speech tts voice-cloning

High-Logic/Genie-TTS

GPT-SoVITS ONNX Inference Engine & Model Converter

Python · 1.4k · 93 · Updated 2 hours ago

gpt-sovits text-to-speech tts vits voice-clone voice-cloning

coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

1.4k · 150 · Updated 3 days ago

speech-emotion-recognition speech-processing speech-recognition speech-separation speech-synthesis speech-to-text stt text-to-speech tts voice-activity-detection voice-cloning voice-recognition

Enemyx-net/VibeVoice-ComfyUI

A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.

Python · 1.4k · 217 · Updated 14 hours ago

ai-audio ai-tts ai-voice ai-voice-clone ai-voice-clonining comfyui-custom-node comfyui-custom-nodes-text-to-speech comfyui-nodes t2s text-to-speech tts vibevoice vibevoice-microsoft voice-cloning voice-generation voice-generator

MiniMax-AI/MiniMax-MCP

Official MiniMax Model Context Protocol (MCP) server that enables interaction with powerful Text to Speech, image generation and video generation APIs.

Python · 1.3k · 234 · Updated 9 hours ago

image-generation image-to-video mcp mcp-server mcp-tools text-to-image text-to-speech text-to-video video-generation voice-cloning

gitmylo/audio-webui

A webui for different audio related Neural Networks

Python · 1.2k · 113 · Updated 22 hours ago

ai aio all-in-one artificial-intelligence audiocraft audioldm bark bark-gui generative-audio generative-music music rvc rvc-gui text-to-audio text-to-speech tts voice-cloning

panyanyany/Twocast

AI podcast generator for bilingual episodes, with multi-language support; an alternative to NotebookLM. Generates lifelike conversational podcasts in multiple languages and multiple voices.

TypeScript · 1.1k · 111 · Updated 8 hours ago

podcast podcast-generator voice-cloning voice-synthesis

devnen/Chatterbox-TTS-Server

Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible API endpoints (incl. OpenAI compatible), predefined voices, voice cloning, and large audiobook-scale text processing. Runs accelerated on NVIDIA (CUDA), AMD (ROCm), and CPU.

Python · 1.1k · 261 · Updated 21 hours ago

ai api-server audio-generation chatterbox chatterbox-tts cuda fastapi huggingface openai-api python pytorch rocm speech-synthesis speech-synthesis-api text-to-speech tts tts-api voice-cloning web-ui

stepfun-ai/Step-Audio-EditX

A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech.

Python · 880 · 59 · Updated 12 hours ago

audio-editing cross-lingual emotion-control paralinguistics reinforcement-learning speaking-style style-control text-to-speech tts voice-cloning zero-shot-tts

Tomiinek/Multilingual_Text_to_Speech (Archived)

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.

Python · 844 · 158 · Updated 1 month ago

code-switching multilingual speech-synthesis text-to-speech tts voice-cloning

OpenMOSS/MOSS-TTS

MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness, and complex real‑world scenarios, covering stable long‑form speech, multi‑speaker dialogue, voice/character design, environmental sound effects, and real‑time streaming TTS.

Python · 843 · 73 · Updated just now

audio audio-tokenizer llm multimodal text-to-speech voice-cloning

diodiogod/TTS-Audio-Suite

A ComfyUI custom node integration for multi-engine multi-language Text-to-Speech and Voice Conversion. Supports: RVC, Echo-TTS, Qwen3-TTS, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterbox (classic and multilingual 23-lang), F5-TTS, Higgs Audio 2 and VibeVoice with unlimited text length, SRT timing, Character support, and many audio tools

Python · 740 · 66 · Updated 3 hours ago

ai-audio audio audio-editing audio-generation audio-processing chatterbox comfyui cozy-voice-3 echo-tts f5 f5-tts higgs-audio indextts-2 qwen3-tts rvc text-to-speech tts vibevoice voice-cloning voice-conversion

gitmylo/bark-voice-cloning-HuBERT-quantizer

The code for the bark-voicecloning model. Training and inference.

Python · 710 · 116 · Updated 15 hours ago

ai neural-networks text-to-speech voice-cloning voice-conversion

PlayVoice/lora-svc

Singing voice change based on Whisper, with LoRA for singing voice cloning

Python · 648 · 79 · Updated 1 month ago

lora singing-voice-conversion speech-to-sing uni-svc vits vits-svc voice-change voice-cloning voice-conversion whisper

fluxions-ai/vui

100M parameter lightweight conversational text-to-speech model with breaths, laughter, multi-speaker dialogue, voice cloning, and streaming. Llama-based, on-device.

Python · 641 · 62 · Updated 4 days ago

audio-generation conversational-ai edge-ai lightweight llama multi-speaker on-device pytorch speech-synthesis streaming text-to-speech tts voice-ai voice-cloning

PaddlePaddle/Parakeet (Archived)

PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)

Python · 618 · 83 · Updated 2 weeks ago

fastpitch fastspeech2 ge2e multi-speaker-tts parallelwavegan speech-synthesis speedyspeech tacotron2 text-frontend text-to-speech transformer-tts voice-cloning waveflow

jackaduma/CycleGAN-VC2

Voice Conversion by CycleGAN (voice cloning/voice conversion): CycleGAN-VC2

Python · 571 · 109 · Updated 2 weeks ago

aigc cyclegan cyclegan-vc cyclegan-vc2 deep-learning deeplearning gan pix2pix pytorch-implementation speech-synthesis voice-cloning voice-conversion
