GitHunt — Discover GitHub Repositories

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook · 17.4k stars · 1.6k forks · Updated just now

3d-whole-body-pose-estimation, automatic-labeling-system, caption, data-generation, image-editing, +3

kaldi-asr/kaldi

kaldi-asr/kaldi is the official location of the Kaldi project.

Shell · 15.3k stars · 5.4k forks · Updated 2 days ago

c-plus-plus, cuda, kaldi, shell, speaker-id, +4

AIGC-Audio/AudioGPT

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Python · 10.2k stars · 866 forks · Updated 1 day ago

audio, gpt, music, sound, speech, +1

mozilla/TTS

🤖 💬 Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)

Jupyter Notebook · 10.1k stars · 1.3k forks · Updated 1 day ago

dataset-analysis, deep-learning, gantts, glow-tts, melgan, +11

modelscope/modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

Python · 8.8k stars · 917 forks · Updated 10 hours ago

cv, deep-learning, machine-learning, multi-modal, nlp, +3

netease-youdao/EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python · 8.5k stars · 746 forks · Updated 5 hours ago

ai, deep-learning, emotion, emotivoice, multi-speaker, +8

snakers4/silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python · 8.4k stars · 735 forks · Updated 2 hours ago

onnx, onnx-runtime, onnxruntime, pytorch, speech, +7

PaddlePaddle/models

Officially maintained, supported by PaddlePaddle, including CV, NLP, Speech, Rec, TS, big models and so on.

Python · 6.9k stars · 2.9k forks · Updated 2 days ago

computer-vision, cv, deep-learning, models, natural-language-processing, +5

TalAter/annyang

💬 Speech recognition for your site

JavaScript · 6.7k stars · 1.0k forks · Updated 3 days ago

speech, speech-recognition, speech-to-text, voice

OpenBMB/VoxCPM

VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

Python · 6.1k stars · 733 forks · Updated 1 hour ago

audio, deeplearning, minicpm, python, pytorch, +6

snakers4/silero-models

Silero Models: pre-trained text-to-speech models made embarrassingly simple

Jupyter Notebook · 5.8k stars · 359 forks · Updated 7 hours ago

armenian, azerbaijani, belarus, colab, georgian, +15

MahmoudAshraf97/whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

Jupyter Notebook · 5.4k stars · 497 forks · Updated 4 hours ago

asr, speaker-diarization, speech, speech-recognition, speech-to-text, +1

huggingface/speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT-4o

Python · 4.5k stars · 507 forks · Updated 1 day ago

ai, assistant, language-model, machine-learning, python, +4

cactus-compute/cactus

Low-latency AI engine for mobile devices & wearables

C · 4.4k stars · 327 forks · Updated 1 hour ago

ai, android, arm, edge, edge-ai, +15

fixie-ai/ultravox

A fast multimodal LLM for real-time voice

Python · 4.4k stars · 365 forks · Updated 13 hours ago

ai, llm, slm, speech

jianchang512/stt

Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats

Python · 4.3k stars · 465 forks · Updated 2 hours ago

speech, speech-recognition, speech-to-text, stt

metavoiceio/metavoice-src

Foundational model for human-like, expressive TTS

Python · 4.2k stars · 691 forks · Updated 1 day ago

ai, deep-learning, pytorch, speech, speech-synthesis, +4

modelscope/ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python · 4.0k stars · 324 forks · Updated 10 hours ago

audio, bandwidth-extension, deep-learning, noise-suppression, pytorch, +6

Rikorose/DeepFilterNet

Noise suppression using deep filtering

Python · 3.9k stars · 421 forks · Updated just now

audio, deep-learning, noise-suppression, pytorch, rust, +2

avinashkranjan/Amazing-Python-Scripts

🚀 Curated collection of Amazing Python scripts from Basic to Advanced, with automation task scripts.

Jupyter Notebook · 3.5k stars · 1.3k forks · Updated 1 day ago

artificial-intelligence, hacktoberfest, machine-learning, projects, python, +4

shu223/iOS-10-Sampler

Code examples for new APIs of iOS 10.