25 results for “topic:paraformer”
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
📣 商用级开源语音自动识别程序库,开箱即用,全平台支持,中英文混合识别。A Cross-platform implementation of ASR inference. It's based on ONNXRuntime and FunASR. We provide a set of easier APIs to call ASR models.
AI Speech Solutions for Tasks such as ASR, Vocal Extraction, Accompaniment Extraction, Audio Denoising, and Enhancement, Support models such as paraformer, sensevoice, fireredasr, zipformer, moonshine, wenet, whisper, fsmn-vad, silero-vad, CT Transformer punc, Spleeter, Uvr5, etc, apply ONNX models in various scenarios.
go-speech 基于 Golang + ONNX 构建的轻量语音库,支持 TTS(文本转语音)与 ASR(语音转文字)。已集成 MeloTTS、Piper、达摩院 Paraformer 架构模型、Whisper 模型。
Podcast Summarizer with LLM Technology
Optimized loss based on cross-entropy (CE), like MWER (minimum WER) Loss with beam search and negative sampling strategy, Smoothed Max Pooling Loss.
这是基于FunASR实现的区分说话人语音识别API | This is a speaker-diarization-based speech recognition API implemented using FunASR.
Realtime ASR for Desktop Subtitle
FunASR实时语音识别版,识别麦克风和电脑内播放的声音,电脑语音打字软件
一个基于qwen-max-latest(LLM) + paraformer-realtime-v2(ASR)的一个实时语音AI面试助手
语音转文本的各类python封装实现(paraformer、whisper_online、whisper_offline、funasr),用于服务kuon仓库
React Native TurboModule for Sherpa-ONNX offline on-device Speech Processing (STT/TTS/Diarization/VAD) completely offline on the device. Support for Android & iOS
Python pipeline to run Paraformer ASR on Intel CPU/GPU/NPU thru ONNX Runtime + OpenVINO Execution Provider
Offline Speech-to-Text for React Native using sherpa-onnx Supports Zipformer, Paraformer, NeMo CTC, Whisper & more.
Real-time voice-to-text transcription with support for multiple AI models and integration with external AI services
🎙 Automated offline video transcription for macOS — FunASR + speaker diarization + language detection (zh/en/mixed). Zero cloud costs, 100% local.
视频/音频一键转文字 + AI 总结 | CLI / Web UI / Telegram Bot 三合一
🎤 Enable voice recognition for the Doubao input method using Python; ideal for learning and research with a focus on audio processing.
No description provided.
🎙️ Enable real-time speech-to-text and translation on macOS with multiple ASR backends and smooth overlay interface for seamless communication.
🎙 Automate offline, local transcription of audio/video on macOS with speaker diarization and no cloud costs.
speech processing tool using Go
🎤 Deploy a simplified voice synthesis service with Fun-CosyVoice3-0.5B-2512, featuring real-time audio output and advanced performance optimizations.
🎨 Explore clip-path techniques in HTML and CSS to create interactive menus and dynamic shapes without JavaScript for responsive design.
🎤 Enable offline speech recognition in React Native using sherpa-onnx, supporting various model architectures for reliable performance.