Yuekai Zhang

FireRedASR2S is a state-of-the-art, industrial-grade, all-in-one ASR system with ASR, VAD, LID, and Punc modules. All modules achieve SOTA performance.

Python 00 Updated 3 weeks ago

yuekaizhang/west (Fork)

We Speech Toolkit, LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction

Python 00 Updated 1 month ago

yuekaizhang/Step-Audio-R1 (Fork)

No description provided.

00 Updated 1 month ago

yuekaizhang/Triton-OpenAI-Speech

OpenAI-Compatible Frontend for Nvidia Triton Inference ASR/TTS Server

Python 221 Updated 1 month ago

yuekaizhang/verl (Fork)

verl: Volcano Engine Reinforcement Learning for LLMs

Python 10 Updated 1 month ago

yuekaizhang/r1-aqa (Fork)

🤗 R1-AQA Model: mispeech/r1-aqa

00 Updated 1 month ago

yuekaizhang/dynamo (Fork)

A Datacenter Scale Distributed Inference Serving Framework

Rust 00 Updated 1 month ago

yuekaizhang/minutes

Podcast Summarizer with LLM Technology

Python 306 Updated 1 month ago

chatglm langchain llm paraformer whisper

yuekaizhang/FlashCosyVoice (Fork)

FlashCosyVoice: A lightweight vLLM implementation built from scratch for CosyVoice.

30 Updated 2 months ago

yuekaizhang/vllm-omni (Fork)

A framework for efficient model inference with omni-modality models

00 Updated 2 months ago

yuekaizhang/sherpa-onnx (Fork)

Real-time speech recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, x86_64 servers, websocket server/client, C/C++, Python, Kotlin

C++ 00 Updated 2 months ago

yuekaizhang/FireRedASR (Fork)

Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics recognition capability.

Python 00 Updated 2 months ago

yuekaizhang/diffusers (Fork)

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.

00 Updated 4 months ago

yuekaizhang/flow_grpo (Fork)

[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL

00 Updated 4 months ago

yuekaizhang/mair-hub (Fork)

No description provided.

Jupyter Notebook 00 Updated 4 months ago

yuekaizhang/Awesome-AudioLM-Datasets

No description provided.

90 Updated 4 months ago

yuekaizhang/ZipVoice (Fork)

Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

Python 10 Updated 4 months ago

yuekaizhang/Step-Audio2 (Fork)

Step-Audio 2 is an end-to-end multi-modal large language model designed for industry-strength audio understanding and speech conversation.

Python 10 Updated 4 months ago

yuekaizhang/NeMo-Guardrails (Fork)

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

Python 00 Updated 5 months ago

yuekaizhang/NeMo (Fork)

NeMo: a toolkit for conversational AI

Python 00 Updated 6 months ago

yuekaizhang/vllm (Fork)

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 00 Updated 6 months ago

yuekaizhang/transformers (Fork)

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

00 Updated 6 months ago

yuekaizhang/Spark-TTS (Fork)

Spark-TTS Inference Code

Python 00 Updated 7 months ago

yuekaizhang/PyTriton-ASR-Server

Pytriton ASR Server

Python 00 Updated 8 months ago
