9 results for “topic:viseme”
Canvas-based talking head model using viseme data.
Copies English shape keys to their Japanese counterparts for MMD animations.
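A minimal sketch of that idea in Blender's `bpy` API, assuming the shape keys follow common MMD naming; the `EN_TO_JA` table below is illustrative, not the repo's actual mapping:

```python
# Copy values from English-named shape keys to their Japanese MMD
# counterparts on the active object. EN_TO_JA is a hypothetical mapping;
# real key names depend on the model.
import bpy

EN_TO_JA = {
    "blink": "まばたき",  # blink
    "a": "あ",            # mouth "a"
    "smile": "笑い",      # smile
}

obj = bpy.context.active_object
keys = obj.data.shape_keys
assert keys is not None, "active object has no shape keys"

for en_name, ja_name in EN_TO_JA.items():
    # bpy collections support membership tests by key name.
    if en_name in keys.key_blocks and ja_name in keys.key_blocks:
        keys.key_blocks[ja_name].value = keys.key_blocks[en_name].value
```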
AI-powered open-source toolkit for real-time English pronunciation feedback.
Open-source example for integrating ElevenLabs conversational AI with animated avatars using Mascotbot SDK. Features real-time lip sync and natural voice interactions.
Electron-based tool for frame-by-frame viseme labeling of audio.
Plays audio with lip sync using different avatar expressions.
FastAPI backend for a multilingual AI avatar system with text-to-speech and voice-to-voice translation. Integrates AWS Bedrock, Polly, Transcribe, and S3 for speech synthesis, transcription, and viseme mapping to enable real-time avatar lip-sync across multiple languages.
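One building block of such a pipeline is Polly's viseme "speech marks", which return timed viseme events a client can use to drive avatar mouth shapes. A hedged sketch of that call (not this repo's code; region and voice are example values):

```python
# Request viseme timing marks from AWS Polly instead of audio.
# Requires boto3 and configured AWS credentials.
import json
import boto3

polly = boto3.client("polly", region_name="us-east-1")

resp = polly.synthesize_speech(
    Text="Hello world",
    VoiceId="Joanna",            # example voice
    OutputFormat="json",         # JSON speech marks, not audio
    SpeechMarkTypes=["viseme"],
)

# The stream is newline-delimited JSON, one object per viseme event,
# e.g. {"time": 125, "type": "viseme", "value": "p"}.
for line in resp["AudioStream"].read().decode("utf-8").splitlines():
    mark = json.loads(line)
    print(mark["time"], mark["value"])
```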
LipGANs is a text-to-viseme GAN framework that generates realistic mouth movements directly from text, without requiring audio. It maps phonemes → visemes, predicts phoneme durations, and uses per-viseme 3D GANs to synthesize photorealistic frames that can be exported as PNG sequences, GIFs, or MP4 videos.
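The phoneme-to-viseme step it describes is typically a many-to-one table: several phonemes share one mouth shape. An illustrative sketch (the `PHONEME_TO_VISEME` table and viseme labels below are assumptions, not the repo's actual mapping):

```python
# Map a phoneme sequence to a viseme sequence, collapsing consecutive
# repeats so the same mouth shape is not emitted twice in a row.
PHONEME_TO_VISEME = {
    "p": "PP", "b": "PP", "m": "PP",  # bilabials share a closed-lip viseme
    "f": "FF", "v": "FF",             # labiodentals
    "k": "kk", "g": "kk",             # velars
    "aa": "aa", "ae": "aa",           # open vowels
}

def phonemes_to_visemes(phonemes):
    """Return the viseme sequence for a phoneme sequence."""
    visemes = []
    for p in phonemes:
        v = PHONEME_TO_VISEME.get(p, "sil")  # default: silence/neutral
        if not visemes or visemes[-1] != v:
            visemes.append(v)
    return visemes

print(phonemes_to_visemes(["p", "aa", "k"]))  # ['PP', 'aa', 'kk']
```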
Frontend implementation of Conflicta. Submitted to ScienceHack 2025, Munich.