50 results for “topic:real-time-transcription”
OBS plugin for local speech recognition and captioning using AI
A desktop application that uses AI to translate voice between languages in real time, while preserving the speaker's tone and emotion.
CleanStream is an OBS plugin that uses AI to remove unwanted words and utterances from live audio streams
Get started using Deepgram's Live Transcription with this Flask demo app
Get started using Deepgram's Live Transcription with this Node demo app
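图会议 (GraphMeeting) is nonlinear meeting software and a collaboration paradigm that upends the traditional way of holding meetings. GraphMeeting turns "everyone unmuted at once" into "a one-person recording booth for each participant", and the system then stitches all the voices together in real time into a shared mind map. It focuses on presenting and expanding each person's ideas through a nonlinear structure. It is more than a meeting tool: it is an environment that fosters discussion and collaboration, letting participants express their views freely while intelligent technology helps each idea find its possibilities and feasibility and ultimately become an actionable path. Every participant's voice is captured accurately, and AI automatically organizes the discussion into a dynamic, continually expanding mind-structure graph, helping teams untangle complex problems, reach consensus, and move decisions and actions forward.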
Python SDKs for Speechmatics APIs
A real-time transcription and translation tool implemented in Python based on the faster-whisper library.
Real-time conversation assistant with dual audio transcription and GPT-powered responses, perfect for meetings and interviews.
#3 Winner of Best Use of Zoom API at Stanford TreeHacks 2025! An AI-powered meeting assistant that captures video, audio and textual context from Zoom calls using multimodal RAG.
Fluent Edge is an intelligent, real-time web application designed to help users improve their spoken English by analyzing live speech for transcription accuracy, grammar errors, and fluency quality. It combines speech recognition, punctuation restoration, grammar checking, and accuracy scoring in a seamless and interactive interface.
Get started using Deepgram's Live Transcription with this Go demo app
A stealthy, AI-agnostic desktop utility built with Python (PySide/Qt). Features screen-share invisibility, real-time transcription, and multi-LLM support (Gemini, Ollama).
Real-time system audio captions and translation into English. Built using faster-whisper, PyQt6, and Torch.
WhisperVoice is a browser extension that converts speech to text in real-time using speech recognition APIs. It’s perfect for quick transcriptions, note-taking, and accessibility, supporting multiple languages and customizable settings for a tailored experience.
This project is a real-time audio transcription system that captures and transcribes speech using Silero VAD and Faster-Whisper.
Python tool for real-time voice recognition and multilingual translation
HotLine is a minimalist, open-source speech-to-text tool for Wayland, using OpenAI's realtime transcription API with a retro telephony vibe.
🛠️ Enhance productivity with StealthIt, a hidden AI utility for screen analysis and voice interaction, ensuring efficiency without detection.
Neural Voice Agent core constructs for conversational AI.
Get started using Deepgram's Live Transcription with this Deno demo app
Professional-grade, real-time voice-to-text for Windows. Stream your voice directly to any application with ultra-low latency. Supports Arabic (RTL), English, and 50+ languages.
Dialogue-based translation web app with support for Nigerian languages, connected to its backend via FastAPI
🎙️ High-performance C++ voice engine with real-time audio processing, AI speech recognition (ASR), transcription while recording, and RNNoise
Get started using Deepgram's Live STT with this FastAPI demo app
London Travel Guide is your personal assistant for exploring the vibrant city of London. Whether you're looking for iconic landmarks, hidden gems, top dining spots, or the best ways to get around, we're here to help. Let us make your trip seamless and unforgettable!
RealTimeTranscriber is an application that leverages the AssemblyAI platform to perform real-time transcription of audio input.
Voice Mate is a React-based web app that converts speech to text and text to speech using the Web Speech API. It offers real-time voice recognition and speech synthesis in the browser, enhancing accessibility, communication, and productivity without needing a backend.
Get started using Deepgram's Live Transcription with this Django demo app
Advanced Discord AI voice bot with real-time Whisper transcription, Ollama AI responses, user analytics, and music playback. Features 200+ voice commands, emotion detection, and comprehensive server statistics. Self-hosted and privacy-focused.