"topic:speech-processing" — Search | GitHunt


763 results for “topic:speech-processing”

speechbrain/speechbrain

A PyTorch-based Speech Toolkit

Python · 11.3k · 1.7k · Updated 4 hours ago

asr, audio, audio-processing, deep-learning, huggingface, language-model, pytorch, speaker-diarization, speaker-recognition, speaker-verification, speech-enhancement, speech-processing, speech-recognition, speech-separation, speech-to-text, speech-toolkit, speechrecognition, spoken-language-understanding, transformers, voice-recognition

pyannote/pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Jupyter Notebook · 9.3k · 1.0k · Updated 5 hours ago

overlapped-speech-detection, pretrained-models, pytorch, speaker-change-detection, speaker-diarization, speaker-embedding, speaker-recognition, speaker-verification, speech-activity-detection, speech-processing, voice-activity-detection

snakers4/silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python · 8.4k · 738 · Updated 3 hours ago

onnx, onnx-runtime, onnxruntime, pytorch, speech, speech-processing, vad, voice-activity-detection, voice-commands, voice-control, voice-detection, voice-recognition

pliang279/awesome-multimodal-ml

Reading list for research topics in multimodal machine learning

6.8k · 897 · Updated 8 hours ago

computer-vision, deep-learning, healthcare, machine-learning, multimodal-learning, natural-language-processing, reading-list, reinforcement-learning, representation-learning, robotics, speech-processing

microsoft/torchscale

Foundation Architecture for (M)LLMs

Python · 3.1k · 225 · Updated 4 days ago

computer-vision, machine-learning, multimodal, natural-language-processing, pretrained-language-model, speech-processing, transformer, translation

linto-ai/whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python · 2.8k · 208 · Updated 2 days ago

asr, attention-is-all-you-need, attention-mechanism, attention-model, attention-network, attention-seq2seq, attention-visualization, deep-learning, machine-learning, multilingual-models, python, python3, pytorch, speaker-diarization, speech, speech-processing, speech-recognition, speech-to-text, transformers, whisper

r9y9/wavenet_vocoder

WaveNet vocoder

Python · 2.4k · 496 · Updated 2 weeks ago

neural-vocoder, python, pytorch, speech, speech-processing, speech-synthesis, wavenet, wavenet-vocoder

resemble-ai/resemble-enhance

AI powered speech denoising and enhancement

Python · 2.2k · 267 · Updated 2 days ago

denoise, speech-denoising, speech-enhancement, speech-processing

DigitalPhonetics/IMS-Toucan

Controllable and fast Text-to-Speech for over 7000 languages!

Python · 2.2k · 320 · Updated 3 days ago

deep-learning, pytorch, speech, speech-processing, speech-synthesis, text-to-speech, toolkit, tts

TEN-framework/ten-vad

Voice Activity Detector (VAD) : low-latency, high-performance and lightweight

C · 2.0k · 160 · Updated 9 hours ago

audio, automatic-speech-recognition, conversational-ai, real-time, silero-vad, speech, speech-processing, vad, voice-activity-detection, voice-agent, voice-commands, voice-recognition

r9y9/deepvoice3_pytorch

PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Python · 2.0k · 482 · Updated 1 month ago

end-to-end, machine-learning, multi-speaker, python, pytorch, speech-processing, speech-synthesis, tts

wq2012/awesome-diarization

A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.

1.9k · 238 · Updated 1 week ago

awesome, awesome-list, deep-learning, machine-learning, speaker-diarization, speech-processing, speech-recognition

coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

1.4k · 150 · Updated 3 days ago

speech-emotion-recognition, speech-processing, speech-recognition, speech-separation, speech-synthesis, speech-to-text, stt, text-to-speech, tts, voice-activity-detection, voice-cloning, voice-recognition

haoheliu/voicefixer

General Speech Restoration

Python · 1.3k · 154 · Updated 1 day ago

declipping, denoise, dereverberation, mel, speech, speech-analysis, speech-enhancement, speech-processing, speech-synthesis, super-resolution, tts, vocoder

ictnlp/StreamSpeech

StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Python · 1.2k · 101 · Updated 21 hours ago

all-in-one, asr, audio-processing, machine-translation, non-autoregressive, seamless, simultaneous-translation, speech, speech-enhancement, speech-processing, speech-recognition, speech-synthesis, speech-to-text, speech-translation, streaming-audio, text-to-audio, text-to-speech, translation, tts, voice

mravanelli/SincNet

SincNet is a neural architecture for efficiently processing raw audio samples.

Python · 1.2k · 270 · Updated 6 hours ago

artificial-intelligence, asr, audio, audio-processing, cnn, convolutional-neural-networks, deep-learning, digital-signal-processing, filtering, neural-networks, python, pytorch, signal-processing, speaker-identification, speaker-recognition, speaker-verification, speech-processing, speech-recognition, timit, waveform

midas-research/audino

Open source audio annotation tool for humans

TypeScript · 1.1k · 141 · Updated 5 days ago

annotation-tool, audio-annotation, audio-processing, datasets, machine-learning, python, speech-processing

X-LANCE/SLAM-LLM

A Framework for Speech, Language, Audio, Music Processing with Large Language Model

Python · 1.0k · 108 · Updated 20 hours ago

audio-processing, large-language-model, multimodal-large-language-models, music-processing, peft, speech-processing

nyrahealth/CrisperWhisper

Verbatim Automatic Speech Recognition with improved word-level timestamps and filler detection

Python · 924 · 48 · Updated 1 day ago

asr, audio, detection, filler, recognition, speech, speech-processing, speech-recognition, timestamps, transcription, verbatim, whisper

Ryuk17/SpeechAlgorithms

You can find the speech algorithms you want here

C · 854 · 264 · Updated 4 days ago

speech-processing

HenryNdubuaku/maths-cs-ai-compendium

Become a cracked AI/ML Research Engineer

JavaScript · 845 · 120 · Updated just now

ai-textbook, algorithms, artificial-intelligence, computer-science, computer-vision, deep-learning, jax, linear-algebra, machine-learning, machine-learning-algorithms, math, mathematics, multimodal-learning, nlp, probability, python, reinforcement-learning, speech-processing, statistics

nanahou/Awesome-Speech-Enhancement

A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.

MATLAB · 819 · 153 · Updated 1 week ago

deep-neural-networks, machine-learning-algorithms, signal-processing, speech-enhancement, speech-processing

drethage/speech-denoising-wavenet

A neural network for end-to-end speech denoising

Python · 708 · 163 · Updated 1 month ago

deep-learning, end-to-end, machine-learning, neural-networks, speech, speech-denoising, speech-processing, wavenet

Tensorflow 2.x implementation of the DTLN real time speech denoising model. With TF-lite, ONNX and real-time audio processing support.

Python · 699 · 172 · Updated 3 days ago

audio, audio-processing, deep-learning, dns-challenge, dtln-model, keras, noise-reduction, noise-suppression, onnx, raspberry-pi, real-time-audio, speech-denoising, speech-enhancement, speech-processing, tensorflow, tf-lite

pliang279/MultiBench

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning

HTML · 613 · 91 · Updated 3 days ago

computer-vision, deep-learning, healthcare, machine-learning, multimodal-learning, natural-language-processing, representation-learning, robotics, speech-processing

huawei-noah/Speech-Backbones

This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.

Jupyter Notebook · 602 · 130 · Updated 1 week ago

speech-processing, speech-recognition, speech-synthesis

ddlBoJack/Speech-Resources

Speech-focused labs, companies, resources, internships, and more; recommendations and self-nominations are welcome

598 · 68 · Updated 1 week ago

speech, speech-processing

Audio-WestlakeU/FullSubNet

PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."

Python · 598 · 159 · Updated 3 days ago

audio, band, denoising, full-band, narrow-band, noise-reduction, paper, pretrained-model, pytorch, reproducible-research, single-channel, speech, speech-enhancement, speech-processing, speech-separation, sub-band

SuperKogito/spafe

spafe: Simplified Python Audio Features Extraction

Python · 480 · 79 · Updated 8 hours ago

audio, audio-analysis, beat, dsp, features-extraction, filterbank, frequencies, frequency, frequency-analysis, gammatone-filterbanks, mfcc, music, music-information-retrieval, pitch, python, signal-processing, sound, speech-processing, time-frequency-analysis, voice

microsoft/UniSpeech

UniSpeech - Large Scale Self-Supervised Learning for Speech

Python · 479 · 74 · Updated 2 weeks ago

diarization, pytorch, speaker-verification, speech, speech-diarization, speech-processing, speech-recognition, speech-separation
