"topic:speech-dataset" — Search

28 results for “topic:speech-dataset”

Feature extraction of speech signal is the initial stage of any speech recognition system.

feature-extraction signal-processing speech speech-dataset speech-feature-extraction speech-features speech-preprocess

MahtaFetrat/ManaTTS-Persian-Speech-Dataset

ManaTTS is the largest open Persian speech dataset with 114+ hours of transcribed audio. Includes data collection pipeline and tools. Suitable for Persian text-to-speech models.

Jupyter Notebook 495 Updated 10 hours ago

data-collection data-preprocessing dataset-preparation forced-alignment mana-tts manatts persian persian-speech speech-corpus speech-data-collection speech-dataset speech-processing speech-synthesis text-to-speech text-to-speech-dataset tts tts-dataset

hetpandya/youtube_tts_data_generator

A Python library to generate speech datasets from YouTube videos

Python 378 Updated 3 weeks ago

dataset-generator python-library speech-dataset text-to-speech text-to-speech-dataset tts tts-dataset youtube youtube-dataset youtube-dataset-generator

aaivu/EmoTa

EmoTa is an open-access Tamil Speech Emotion Recognition dataset with 936 utterances from 22 native speakers, covering five emotions (anger, happiness, sadness, fear, and neutrality). It supports emotion classification tasks and advances Tamil language processing.

Python 271 Updated 1 week ago

chipsal coling2025 emota emotional-speech ser speech-dataset speech-emotion-recognition tamil-language-processing tamil-speech-emotion-recognition

ruslan-corpus/ruslan-corpus.github.io

No description provided.

HTML 210 Updated 11 months ago

russian speech-corpus speech-dataset text-to-speech tts

fjxmlzn/RNN-SM

[T-IFS] RNN-SM: Fast Steganalysis of VoIP Streams Using Recurrent Neural Network

Python 2013 Updated 5 months ago

algorithm idc rnn-sm speech-dataset ss-qccn steganalysis

gauthelo/kallaama-speech-dataset

A transcribed speech dataset in Wolof, Pulaar and Sereer, to support agriculture. Funded by Lacuna Fund.

1810 Updated 1 month ago

agriculture natural-language-processing senegal-language speech-dataset speech-processing

manankshastri/Trigger-Word-Detection

Construct a speech dataset and implement an algorithm for trigger word detection (sometimes also called keyword detection, or wakeword detection).

Jupyter Notebook 166 Updated 1 year ago

deep-learning gated-recurrent-units python rnn speech-dataset trigger-word-detection

petrichorwq/DECRO-dataset

Deepfake cross-lingual evaluation dataset (DECRO) is constructed to evaluate the influence of language differences on deepfake detection.

160 Updated 2 weeks ago

deepfake-detection speech-dataset

ina-foss/InaGVAD

Voice activity detection and speaker gender segmentation audiovisual corpus

Jupyter Notebook 164 Updated 5 months ago

acoustic-diversity audio-dataset audio-segmentation audiovisual-dataset benchmark corpus dataset gender gender-bias gender-prediction gender-representation radio speaker-gender speech-activity-detection speech-corpus speech-dataset tv voice-activity-detection

Rumeysakeskin/Speech-Datasets-for-ASR

Download speech datasets (English and non-English) for Automatic Speech Recognition

Jupyter Notebook 151 Updated 10 months ago

asr audio-datasets common-voice-dataset speech-dataset speech-processing speech-recognition speech-synthesis speech-to-text voice-datasets voxforge-dataset

MahtaFetrat/GPTInformal-Persian-Speech-Dataset

A free licensed Persian TTS dataset including 6+ hours of audio-text pairs with subject

101 Updated 2 months ago

revsic/speechset

Numpy-librosa implementation of Speech dataset pipeline

Python 96 Updated 2 years ago

preprocessor speech-dataset tts vocoder

MahtaFetrat/Mana-Forced-Aligner

A robust forced alignment tool for low-resource languages using multiple ASR models and CER-based matching. Built for noisy data and imperfect transcripts.

Jupyter Notebook 60 Updated 5 months ago

asr forced-alignment low-resource-languages mana-tts manatts open-source speech-alignment speech-dataset tts

Ijwi-ry-Ikirundi-AI/Kirundi_Dataset

🇧🇮 The first large-scale, open-source speech and text dataset for the Kirundi language. Building AI models for 12M+ Kirundi speakers through community collaboration. Includes ASR, TTS, and MT capabilities.

Jupyter Notebook 62 Updated 4 days ago

african-languages ai asr burundi community-driven kirundi low-resource-language machine-learning machine-translation nlp open-dataset speech-dataset speech-recognition text-to-speech tts

Ralireza/PSDR

Persian spoken digit recognition

Python 60 Updated 2 years ago

persian persian-dataset persian-speech-recognition persian-spoken-digit speech-analysis speech-dataset speech-recognition speech-recognizer

KanishkNavale/Speech-Emotion-Recognition (Archived)

A simple CNN-LSTM deep neural model using TensorFlow to classify emotions from a speech dataset

Jupyter Notebook 60 Updated 4 months ago

cnn deep-learning lstm speech-dataset speech-emotion-recognition tensorflow

neuralwork/speech-collector

A full-stack webapp for collecting and managing speech datasets.

TypeScript 60 Updated 3 months ago

collection dataset dataset-collection dataset-generation speech-dataset voice-dataset

mborsdorf/GlobalPhoneMS_Scripts

No description provided.

MATLAB 40 Updated 1 year ago

auditory-attention deep-learning matlab multilingual python speech-dataset speech-separation

MahtaFetrat/VirgoolInformal-Speech-Dataset

A dataset of informal Persian audio and text chunks, along with a fully open processing pipeline, suitable for ASR and TTS tasks. Created from crawled content on virgool.io.

Jupyter Notebook 42 Updated 1 month ago

asr asr-evaluation forced-alignment persian persian-speech-corpus persian-speech-dataset persian-speech-recognition persian-text-to-speech speech-data-collection speech-dataset speech-processing tts

nafiuny/voice_conversion_dataset

top dataset for voice conversion models

30 Updated 6 months ago

audio-dataset audio-datasets dataset datasets pyth python speech-dataset speech-to-text text-to-speech tts tts-dataset vc vc-dataset voice-conversion voice-dataset voice-datasets

cyrta/50languages

A corpus (dataset) of speech recordings in 50 languages

PHP 30 Updated 1 year ago

corpus speech speech-dataset

mborsdorf/TargetLanguageExtraction

No description provided.

30 Updated 6 months ago

audio audio-processing auditory-attention deep-learning matlab multilingual python pytorch source-separation speaker-extraction speech-corpus speech-database speech-dataset speech-processing speech-separation

seanpm2001/AI2001_Category-Audio-SC-Speeches

🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🎼️🎷️ The audio:speeches category for AI2001, containing speech datasets

R 21 Updated 2 years ago

ai-2001 ai-2001-dataset ai-2001-development ai2001 ai2001-dataset ai2001-development audio-dataset dataset gpl3 gplv3 md r-language rmarkdown-language speech-audio-dataset speech-dataset txt

PanosAntoniadis/fast-recorder

Simple script that creates a speech dataset quickly

Python 20 Updated 5 years ago

recorder speech-dataset speech-to-text sphinx-4

Nexdata-AI/393-Hours-Korean-Children-Speech-Data-by-Mobile-Phone

393-Hours-Korean-Children-Speech-Data-by-Mobile-Phone

00 Updated 1 year ago

children children-speech-recognition korean speech-dataset speech-recognition

Nexdata-AI/2-People-New-Zealand-English-Average-Tone-Speech-Synthesis-Corpus

2-People-New-Zealand-English-Average-Tone-Speech-Synthesis-Corpus

00 Updated 1 year ago

speech-analysis speech-dataset speech-synthesis text-to-speech

Madwesh-india/AudioCollector

This interactive Python tool enables the recording of bilingual audio samples using PyAudio and ipywidgets. Designed for data collection tasks such as speech datasets, it provides a user-friendly interface to record, save, label, and manage audio files directly within a Jupyter Notebook.

Jupyter Notebook 00 Updated 7 months ago

dataset-collection jupyter-notebook language-data speech-dataset