70 results for “topic:pyannote”
Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.
Cutting-edge AI technology for automated audio transcription. A nice GUI for OpenAI's Whisper and pyannote (speaker identification)
No description provided.
Open source inference code for Rev's model
Very fast, accurate speaker diarization
Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.
Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.
EchoInStone is an audio processing tool that transcribes, diarizes, and aligns speaker segments from audio files, prioritizing accuracy and reliability.
Official repository for Mamba-based Segmentation Model for Speaker Diarization
Transcription from MP3 files to HTML, with or without an embedded player
Ultra-fast, customizable speech-to-text and speaker diarization for noisy, multi-speaker audio. Includes advanced noise reduction, stereo channel support, and flexible audio preprocessing—ideal for call centers, meetings, and podcasts.
Real-time speaker diarization using straightforward, intuitive logic - High accuracy thanks to SpeechBrain/Pyannote-WeSpeaker models
Automatically set up the AISHELL-4 and MSDWild datasets for use with pyannote-database (and pyannote-audio)
Multi-source transcript merging inspired by textual criticism — LLM adjudicates multiple Whisper, YouTube captions & external transcripts for higher quality. Includes speaker diarization and summarization.
Speech-to-text GUI for various models (mostly Whisper, also Voxtral) and backends, including whisper.cpp, mlx-whisper, faster-whisper, and ctranslate2; uses pyannote for diarization
A package that can be locally executed to generate minutes in Japanese
Feed in a video; it outputs context-contained clips resized to 9:16, keeping the speaker centered
Faster Whisper with Speaker Diarization
GPU-accelerated WhisperX on NVIDIA Blackwell (SM_121) - DGX Spark compatible
ASR (Automatic Speech Recognition) Notebooks
🎵 Complete offline audio transcription system with speaker diarization using OpenAI Whisper and PyAnnote. Features automatic audio cleaning, precise timestamps, multiple output formats (JSON/TXT/Markdown), and support for 20+ audio formats. No external APIs required - works entirely offline.
Voice interface for OpenClaw with speaker recognition, voice-gated security, real-time barge-in, and multi-provider streaming TTS
Verbatim Swedish Whisper transcription and speaker diarization with word-level timestamps
A versatile video localization tool that provides dubbing and audio synchronization with real-time pitch, accent, and emotional tone adjustments across multiple languages. 🗣️
Transcribe a large audio file with speaker diarization using NVIDIA Parakeet v2
Hobby project that transcribes meeting audio files into transcripts with a summary
Subtitle generation with speaker diarization using Whisper and pyannote.audio
CLI for automating the creation of subtitles using Whisper and pyannote. Accepts batch or single audio and video files.
Companion repository to the paper "On the calibration of powerset speaker diarization models" published at Interspeech 2024
Next revolution of pyannote