32 results for “topic:hifi-gan”
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
OpenMusic: SOTA Text-to-music (TTM) Generation
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech
A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS
Vietnamese Text to Speech library
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modeling methods. This project grows with the research community, aiming to achieve the ultimate E2E-TTS
🎙️ Arabic TTS models (Tacotron2, FastPitch)
Avocodo: Generative Adversarial Network for Artifact-free Vocoder
Baseline Recipe for VoicePrivacy Challenge 2022: anonymization systems and evaluation software
PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. This implementation supports both single- and multi-speaker TTS, plus several techniques to improve the robustness and efficiency of the model.
Use FastSpeech2 and HiFi-GAN to easily perform end-to-end Korean speech synthesis.
A neural speech codec based on discrete WavLM representations
In this repo, I developed a step-by-step pipeline for a standard multi-speaker Text-to-Speech system :smile: In general, I used PortaSpeech as the acoustic model and iSTFTNet as the vocoder...
This is the experimental description of MnTTS2.
Train HiFi-GAN on TPU
Audio samples from "HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis"
🎙️ German TTS (FastPitch) with Thorsten voice / emotional
DelightfulTTS with HiFi-GAN and UnivNet vocoders
Python package for NSF and NSF-HiFi-GAN (unofficial)
A PyTorch implementation of a Non-Autoregressive Transformer with unsupervised duration learning, based on Transformer and Conformer blocks, with support for the Vietnamese language.
Jeju-language speech synthesis (improvements in progress)
On-device iOS Text-to-Speech using FastSpeech2 and HiFi-GAN (Japanese & English)
Aligning latent space of speaking style with human perception using a re-embedding strategy
Neural vocoder for high-fidelity speech synthesis (implementation of the referenced research)
The Speech Synthesis App converts text into natural-sounding speech using advanced models, providing an interactive platform for audio generation.
Doing devious stuff with audio
If you have a wav file and a transcript, you can train HiFi-GAN right now.
HiFiGAN Implementation
This repository contains the code and resources associated with my Bachelor's Thesis. The project evaluates the performance of various automatic speaker verification (ASV) systems against identity spoofing attacks generated using text-to-speech (TTS) synthesis technologies.