975 results for “topic:stt”
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
Voice Recognition to Text Tool / An offline, locally run tool that converts audio and video to subtitles, with output in JSON, SRT subtitle, and plain-text formats
Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
🎙Speech recognition using the TensorFlow deep learning framework and sequence-to-sequence neural networks
Meet Ava, the WhatsApp Agent
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.
Synchronized Translation for Videos. Video dubbing
Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]
Dicio assistant app for Android
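A Java enterprise-grade management platform for Xiaozhi ESP32, providing an integrated front-end, back-end, and server-side solution for device monitoring, voice customization, role switching, and conversation record management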
Open STT
Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)
🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser
Running a speech-to-text model (whisper.cpp) in Unity3d on your local machine.
MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.
On-device streaming speech-to-text engine powered by deep learning
💬 /so.nus/ STT (speech to text) for Node with offline hotword detection
A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress
A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT applications, and DIY robotics. Built with Python, NextJS, Arduino, ESP32, LLMs (GPT-4o), Deepgram STT and Azure TTS 🤖
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
A collection of resources to make a smart speaker
On-device speech-to-text engine powered by deep learning
Fast text-based video editing: a Node/Electron OS X desktop app with a Backbone front end.
A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.