158 results for “topic:speaker-identification”
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Frontier CoreML audio models in your apps — text-to-speech, speech-to-text, voice activity detection, and speaker diarization. In Swift, powered by SOTA open source.
SincNet is a neural architecture for efficiently processing raw audio samples.
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain users can easily create speech processing systems, ranging from speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
Speaker Identification System (up to 100% accuracy); built using Python 2.7 and the python_speech_features library
Simple d-vector based Speaker Recognition (verification and identification) using PyTorch
Very fast speech-to-text, diarization, and streaming (even on CPU) with NVIDIA Parakeet in Rust
Deep Learning - one shot learning for speaker recognition using Filter Banks
Identifying people from small audio fragments
Official Implementation of the work "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning"
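Building the simplest collection of TTS front ends and the simplest audiobook production workflow. Novels are split into sentences with regex rules, and RoBERTa identifies the speaker of each line of dialogue, enabling one-click generation of multi-voice audiobooks. Multi-speaker speech synthesis for high-quality audiobook production.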
Lightweight neural speaker embedding extraction based on Kaldi and PyTorch.
[SLT'24] The official implementation of SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model
This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
Source code for paper "Who is real Bob? Adversarial Attacks on Speaker Recognition Systems" (IEEE S&P 2021)
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification"
PyTorch implementation of Generalized End-to-End Loss for speaker verification
A tool for summarizing dialogues from videos or audio
mirror of VoxCeleb dataset - a large-scale speaker identification dataset
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
Keras Implementation of Deepmind's WaveNet for Supervised Learning Tasks
Multi-speaker separation, identification, diarization ALL-IN-ONE. It can isolate the target speaker from a conversation audio and do ASR.
Speakerbox: Fine-tune Audio Transformers for speaker identification.
Voiceprint Recognition (VPR), also known as Speaker Recognition, comes in two types: Speaker Identification and Speaker Verification
Speaker identification using voice MFCCs and GMM
Implementation of the paper "Attentive Statistics Pooling for Deep Speaker Embedding" in PyTorch
This master's thesis project is based on OpenAI Whisper, with the goal of transcribing interviews