
LiveNexus AI


Real-Time Hybrid Audio Intelligence Platform


LiveNexus AI is a real-time speech intelligence system. It demonstrates sub-200ms transcription latency on standard CPU hardware by combining WebRTC transport (LiveKit) with optimized edge inference (Faster-Whisper + VAD).


🚀 Quick Start

Get the system running in 2 steps:

# 1. Start Frontend (UI)
npm install && npm run dev

# 2. Start AI Worker (Deep Learning)
cd ai-worker && docker build -t worker . && docker run --env-file ../.env.local worker

Detailed setup: see GETTING_STARTED.md for API keys.


📸 Demo & Architecture

System Architecture

WebRTC Client -> LiveKit Cloud -> Python Worker (VAD + Whisper)

Real-Time Pipeline

Audio In -> VAD Filter -> Inference -> DataChannel Out
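The flow above can be sketched on the worker side. This is a minimal stand-in, not the project's implementation: a simple energy threshold plays the role of webrtcvad's is_speech() check, the inference and DataChannel steps are omitted, and all names and the threshold value are illustrative.

```python
import struct

def frame_energy(pcm_frame: bytes) -> float:
    """Mean absolute amplitude of a 16-bit little-endian PCM frame."""
    n = len(pcm_frame) // 2
    samples = struct.unpack(f"<{n}h", pcm_frame[: n * 2])
    return sum(abs(s) for s in samples) / max(n, 1)

def vad_gate(frames, threshold: float = 500.0):
    """Yield only frames that look like speech; silent frames are
    dropped before they ever reach the (expensive) inference step."""
    for frame in frames:
        if frame_energy(frame) >= threshold:
            yield frame

# A silent frame is dropped; a loud synthetic frame passes through.
silence = b"\x00\x00" * 160             # 10 ms of silence at 16 kHz
speech = struct.pack("<h", 8000) * 160  # constant loud frame
kept = list(vad_gate([silence, speech]))
```

In the real pipeline the surviving frames would be fed to Faster-Whisper and the transcript published over a LiveKit DataChannel.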

Features Overview

Double-Buffer UI, CPU Optimization, and Binary Transport
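The "Double-Buffer" idea can be illustrated with a small sketch (a hypothetical class; the real UI lives in the Next.js client): finalized text accumulates in a committed buffer, while each interim hypothesis overwrites a pending buffer, so corrections never make already-final text flicker.

```python
class DoubleBufferTranscript:
    """Two buffers: `committed` holds finalized text, `pending` holds
    the latest interim hypothesis. The UI renders their concatenation,
    so interim corrections never disturb committed text."""

    def __init__(self) -> None:
        self.committed = ""
        self.pending = ""

    def on_interim(self, text: str) -> None:
        self.pending = text           # overwrite, never append

    def on_final(self, text: str) -> None:
        self.committed += text + " "  # promote to the stable buffer
        self.pending = ""

    def render(self) -> str:
        return self.committed + self.pending

ui = DoubleBufferTranscript()
ui.on_interim("hel")
ui.on_interim("hello wor")  # correction replaces the old interim text
ui.on_final("hello world")
```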

Deep Dive: See ARCHITECTURE.md for the VAD Gating logic.


✨ Key Features

  • ⚡ <200ms Latency: Optimized quantized models run faster than round-trips to cloud APIs.
  • 🔇 VAD Gating: webrtcvad drops ~70% of packets as silence, sharply reducing CPU load.
  • 🧠 Resource Intelligence: Automatically downgrades the model size when CPU usage exceeds 80%.
  • 🔄 Zero-Stutter UI: A "double-buffer" rendering strategy keeps text updates smooth.
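The resource-intelligence rule might look like the sketch below. Only the 80% downgrade trigger comes from the feature list; the model ladder, the 50% upgrade threshold, and the function names are assumptions, and in the real worker the CPU reading would come from something like psutil.cpu_percent().

```python
MODEL_LADDER = ["tiny", "base", "small"]  # smallest -> largest (assumed names)

def pick_model(cpu_percent: float, current: str) -> str:
    """Downgrade one step when CPU exceeds 80%; upgrade one step when it
    falls below 50%. The gap between thresholds is hysteresis, so the
    worker does not flap between sizes near a single cutoff."""
    i = MODEL_LADDER.index(current)
    if cpu_percent > 80.0 and i > 0:
        return MODEL_LADDER[i - 1]
    if cpu_percent < 50.0 and i < len(MODEL_LADDER) - 1:
        return MODEL_LADDER[i + 1]
    return current
```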

📚 Documentation

| Document | Description |
| --- | --- |
| System Architecture | Hybrid cloud/edge design and the VAD pipeline. |
| Getting Started | Connecting to LiveKit and running the Docker worker. |
| Failure Scenarios | Handling high CPU and network jitter. |
| Interview Q&A | "Why not OpenAI API?" and "WebSockets vs DataChannels". |

🔧 Tech Stack

| Component | Technology | Role |
| --- | --- | --- |
| Transport | LiveKit (WebRTC) | SFU & signaling |
| Inference | Faster-Whisper | Quantized speech-to-text |
| Filter | webrtcvad | Voice activity detection |
| Frontend | Next.js 14 | Real-time UI |

👤 Author

Harshan Aiyappa
Senior Full-Stack Hybrid Engineer
GitHub Profile


๐Ÿ“ License

This project is licensed under the MIT License - see the LICENSE file for details.