151 results for “topic:diarization”
Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.
Synchronized Translation for Videos. Video dubbing
Turnkey self-hosted offline transcription and diarization service with LLM summaries
UniSpeech - Large Scale Self-Supervised Learning for Speech
Open source inference code for Rev's model
One-stop, fully automated subtitle generator covering the whole pipeline: downloading, transcription, translation, and hardcoding subtitles, with no human intervention required.
Gecko - A Tool for Effective Annotation of Human Conversations
Rust bindings to https://github.com/k2-fsa/sherpa-onnx
Very fast, accurate speaker diarization
A fully local and private Speech-To-Text app with cross-platform support, speaker diarization, Audio Notebook mode, LM Studio integration, and both longform and live transcription.
Identify the emotion of multiple speakers in an Audio Segment
Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code
🎙️ Automatically transcribe audio/video into high-quality, speaker-specific Text-To-Speech datasets ✨
pyannote audio diarization in Rust
Python package for combining diarization system outputs.
Callytics is an advanced call analytics solution that uses speech recognition and large language model (LLM) technologies to analyze phone conversations from customer service and call centers.
Local, free speech recognition with OpenAI Whisper. Automate the transcription of lectures and meetings on your PC without cloud services or subscriptions.
On-device speaker diarization powered by deep learning
Tool for automatic transcription and speaker diarization based on Whisper and pyannote.
A lightweight library to compute Diarization Error Rate (DER).
EchoInStone is an audio processing tool that transcribes, diarizes, and aligns speaker segments from audio files, prioritizing accuracy and reliability.
Neural-network-based similarity scoring for diarization (PyTorch implementation of "LSTM based Similarity Measurement with Spectral Clustering for Speaker Diarization")
🎙️ Drop-in replacement for paid transcription APIs. Self-hosted, GPU-powered, speaker diarization. Free forever: uvx murmurai
Speaker diarization for Python — "who spoke when?" CPU-only, no API keys, Apache 2.0. ~10.8% DER on VoxConverse, 8x faster than real-time.
Simple Python package for fast DER computation
PAFTS: a library for preprocessing audio for TTS.
Easy-to-use multi-provider ASR/speech-to-text and NLP engine
Whisper Speaker Identification (WSI), a cutting-edge model for multilingual speaker identification.
Transcription from MP3 files to HTML, with or without an embedded player
Scripts for data generation, scoring and data manifest preparation for CHiME-8 DASR task.