"topic:stt" — Search

975 results for “topic:stt”

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

Python · 33.3k · 2.0k · Updated 1 hour ago

agent, ai, assistant, chat, chatgpt, emacs, image-generation, llama3, llamacpp, llm, obsidian, obsidian-md, offline-llm, productivity, rag, research, self-hosted, semantic-search, stt, whatsapp-ai

alphacep/vosk-api

Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Jupyter Notebook · 14.4k · 1.7k · Updated 9 hours ago

android, asr, deep-learning, deep-neural-networks, deepspeech, google-speech-to-text, ios, kaldi, offline, privacy, python, raspberry-pi, speaker-identification, speaker-verification, speech-recognition, speech-to-text, speech-to-text-android, stt, voice-recognition, vosk

GetStream/Vision-Agents

Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.

Python · 7.3k · 566 · Updated 4 hours ago

agentic-ai, agents, ai, ai-agents, realtime, stt, tts, video-agents, video-ai, vision-ai, voice-ai

jianchang512/stt

Voice Recognition to Text Tool / A local, offline tool that converts audio and video to subtitles, with output as JSON, SRT subtitles, or plain text

Python · 4.3k · 464 · Updated 14 hours ago

speech, speech-recognition, speech-to-text, stt

pluja/whishper

Transcribe any audio to text, translate and edit subtitles 100% locally with a web UI. Powered by whisper models!

Svelte · 2.9k · 169 · Updated 3 hours ago

ai, audio-to-text, golang, speech-recognition, speech-to-text, stt, subtitles, sveltekit, transcription, ui, web, web-whisper, webapp, whisper

coqui-ai/STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

C++ · 2.6k · 302 · Updated 4 days ago

asr, automatic-speech-recognition, deep-learning, speech-recognition, speech-recognition-api, speech-recognizer, speech-to-text, stt, tensorflow, voice-recognition

pannous/tensorflow-speech-recognition

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

Python · 2.2k · 634 · Updated 3 weeks ago

deep-learning, neural-network, speech-recognition, speech-to-text, stt, tensorflow

neural-maze/ava-whatsapp-agent-course

Meet Ava, the WhatsApp Agent

Python · 1.6k · 419 · Updated 2 days ago

agent, agent-based, agentic-workflow, agents, stt, tts, vector-database

mkiol/dsnote

Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.

C++ · 1.4k · 61 · Updated 7 hours ago

asr, flatpak-applications, linux-desktop, machine-translation, nmt, offline, sailfishos, speech-recognition, speech-synthesis, speech-to-text, stt, text-to-speech, translation, translator, tts

coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

1.4k · 150 · Updated 3 days ago

speech-emotion-recognition, speech-processing, speech-recognition, speech-separation, speech-synthesis, speech-to-text, stt, text-to-speech, tts, voice-activity-detection, voice-cloning, voice-recognition

lenML/Speech-AI-Forge

🍦 Speech-AI-Forge is a project developed around TTS generation model, implementing an API Server and a Gradio-based WebUI.

Python · 1.4k · 183 · Updated 2 days ago

agent, asr, chattts, chattts-forge, chinese, colab, cosy-voice, cosyvoice, english, firered, fireredtts, fish-speech, gpt, llama, llm, ssml, stt, text-to-speech, tts, whisper

R3gm/SoniTranslate

Synchronized Translation for Videos. Video dubbing

Python · 1.3k · 318 · Updated 2 days ago

asr, audio-processing, automatic-dubbing, diarization, document-translator, dubbing, speech-to-text, stt, subtitle-to-speech, text-to-speech, translate-audio, translate-video, translation, tts, video-dubbing

Robitx/gp.nvim

Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic, ..]

Lua · 1.3k · 126 · Updated 3 days ago

claude, codeium, copilot, gemini, gpt-4o, gpt4o, llm, lua, mistral, neovim, nvim, ollama, parrot, perplexity, sonnet, speech-to-text, stt, vim, voice, whisper

Stypox/dicio-android

Dicio assistant app for Android

Kotlin · 1.3k · 136 · Updated 23 hours ago

android, assistant, dicio, personal-assistant, skills, stt, tts, voice-assistant, vosk, wakeword

joey-zhou/xiaozhi-esp32-server-java

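An enterprise-grade Java management platform for Xiaozhi ESP32, providing an integrated front-end, back-end, and server solution for device monitoring, voice customization, role switching, and conversation history management

Java · 1.2k · 425 · Updated 21 hours ago

esp32, java, mcp, mcp-client, mcp-server, spring-ai, stt, tts, xiaozhi, xiaozhi-ai, xiaozhi-esp32, xiaozhi-server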

snakers4/open_stt (Archived)

Open STT

Python · 818 · 87 · Updated 1 month ago

asr, automatic-speech-recognition, dataset, russian, speech-to-text, stt

VRCWizard/TTS-Voice-Wizard

Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) (VTuber TTS)

C# · 776 · 82 · Updated 2 days ago

chatbox, discord, free, heart-rate, osc, speech-recognition, speech-to-text, spotify, stt, text-to-speech, tts, voice, vrchat, vtuber

lobehub/lobe-tts

🎤 Lobe TTS - A high-quality & reliable TTS/STT library for Server and Browser

TypeScript · 774 · 97 · Updated 1 day ago

auzre, bun, edge, lobehub, microsoft-speech-api, nodejs, opeanai, react, speech-recognition, speech-to-text, stt, text-to-speech, tts

Macoron/whisper.unity

Running speech to text model (whisper.cpp) in Unity3d on your local machine.

C# · 701 · 166 · Updated 1 week ago

asr, openai, speech-recognition, speech-to-text, stt, unity3d, whisper

madroidmaq/mlx-omni-server

MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.

Python · 674 · 84 · Updated just now

function-calling, genai, mlx, openai, openai-api, structured-output, stt, tools, tts

Picovoice/cheetah

On-device streaming speech-to-text engine powered by deep learning

Python · 661 · 76 · Updated 7 hours ago

asr, automatic-speech-recognition, online-speech-recognition, speech-recognition, speech-to-text, streaming-speech-to-text, stt, transcription, voice-recognition

evancohen/sonus

💬 /so.nus/ STT (speech to text) for Node with offline hotword detection

JavaScript · 637 · 77 · Updated 3 days ago

alexa, hotword-detection, keyword-spotting, node, speech, speech-recognition, speech-to-text, stt, voice-control, voice-recognition

bbc/react-transcript-editor

A React component to make correcting automated transcriptions of audio and video easier and faster. By BBC News Labs. - Work in progress

JavaScript · 613 · 169 · Updated 1 week ago

bbc-news-labs, kaldi, news-labs, react, stt, textav, transcript, transcript-editor, transcription

StarmoonAI/Starmoon

A conversational, AI device + software framework for companionship, entertainment, education, healthcare, IoT applications, and DIY robotics. Built with Python, NextJS, Arduino, ESP32, LLMs (GPT-4o), Deepgram STT and Azure TTS 🤖

TypeScript · 544 · 62 · Updated 1 week ago

esp32, gpt, iot, llm, openai, robotics, stt, tts, voice-assistant

waybarrios/vllm-mlx

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.

Python · 544 · 75 · Updated just now

anthropic, apple-silicon, audio-processing, claude-code, computer-vision, image-understanding, inference, llm, machine-learning, macos, mllm, mlx, multimodal-ai, speech-to-text, stt, text-to-speech, tts, video-understanding, vision-language-model, vllm

ccoreilly/vosk-browser

A speech recognition library running in the browser thanks to a WebAssembly build of Vosk

JavaScript · 507 · 87 · Updated 1 week ago

asr, kaldi, speech-recognition, speech-to-text, stt, typescript, vosk, wasm, webassembly

voice-engine/make-a-smart-speaker

A collection of resources to make a smart speaker

475 · 94 · Updated 1 day ago

aec, beamforming, kws, nlu, stt, tts, voice-assistant

Picovoice/leopard

On-device speech-to-text engine powered by deep learning

Python · 473 · 29 · Updated 5 hours ago

asr, automatic-speech-recognition, on-device, speech-recognition, speech-to-text, stt, transcription, voice-recognition, voice-to-text

OpenNewsLabs/autoEdit_2

Fast text based video editing, node Electron Os X desktop app, with Backbone front end.

JavaScript · 450 · 56 · Updated 2 weeks ago

autoedit, backbone, backbonejs, desktop, dmg, edl, electron, gentle, gentle-stt, ibm-watson, ibm-watson-speech, macosx, speech-to-text, speechmatics, stt, transcription, video-editing, video-sequences, watson

gia-guar/JARVIS-ChatGPT (Archived)

A Conversational Assistant equipped with synthetic voices including J.A.R.V.I.S's. Powered by OpenAI and IBM Watson APIs and a Tacotron model for voice generation.

Python · 445 · 101 · Updated 2 weeks ago

ai, chat-gpt-3, chatgpt, chatgpt-api, elevenlabs, ibm-watson, jarvis-ai, openai, python, pytorch, speech-recognition, stt, tacotron, tts
