Voice AI - Conversation Recorder POC
A web-based AI tool that records conversations, transcribes them using OpenAI's Whisper API, and generates AI-powered summaries using GPT-3.5-turbo.
Features
- 🎤 Voice Recording: Record conversations directly in the browser using the Web Audio API
- 📝 Automatic Transcription: Convert speech to text using OpenAI's Whisper API
- 🧠 AI Summarization: Generate concise summaries using GPT-3.5-turbo
- 💾 Local Storage: Store recordings in a SQLite database
- 🎨 Modern UI: Beautiful interface built with Bulma CSS and Alpine.js
- ⚡ Real-time Processing: Background processing with status indicators
Tech Stack
- Backend: Rust with Actix-web
- Frontend: Tera templates with Bulma CSS and Alpine.js
- Database: SQLite with SQLx
- AI Services: OpenAI API (Whisper + GPT-3.5-turbo)
- Audio: Web Audio API for client-side recording
Prerequisites
- Rust (latest stable version)
- OpenAI API key
Setup
- Clone and navigate to the project:
  cd voiceai
- Set up environment variables:
  Create a .env file in the project root:
  OPENAI_API_KEY=your_openai_api_key_here
  DATABASE_URL=sqlite:voiceai.db
- Install dependencies and run:
  cargo run
- Access the application:
  Open your browser and go to http://127.0.0.1:8080
Usage
- Start Recording:
  - Enter a title for your recording
  - Click "Start Recording" and allow microphone access
  - Speak clearly into your microphone
  - Click "Stop Recording" when finished
- View Results:
  - The recording will be automatically processed
  - Transcription and summary will appear in the recordings list
  - Click "View Details" to see the full transcription and summary
- Manage Recordings:
  - All recordings are stored locally in the SQLite database
  - Use the "Refresh" button to reload the recordings list
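The "View Results" step relies on processing that runs in the background while the UI shows a status. The sketch below illustrates that idea with the standard library only. The `Status` variants and the `process_in_background` function are hypothetical names chosen for this example, and the `transcribe` and `summarize` closures stand in for the Whisper and GPT calls; the project's actual code (which runs on Actix-web) may be organized quite differently.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical processing states; the project's real status values may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Pending,
    Transcribing,
    Summarizing,
    Completed { transcript: String, summary: String },
    Failed(String),
}

/// Runs the two AI steps on a worker thread and reports each state change.
/// `transcribe` and `summarize` are placeholders for the real API calls.
pub fn process_in_background<T, S>(
    audio: Vec<u8>,
    transcribe: T,
    summarize: S,
) -> Receiver<Status>
where
    T: FnOnce(&[u8]) -> Result<String, String> + Send + 'static,
    S: FnOnce(&str) -> Result<String, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Send errors are ignored: they only mean the receiver was dropped.
        let _ = tx.send(Status::Pending);
        let _ = tx.send(Status::Transcribing);
        let transcript = match transcribe(&audio) {
            Ok(text) => text,
            Err(e) => {
                let _ = tx.send(Status::Failed(e));
                return;
            }
        };
        let _ = tx.send(Status::Summarizing);
        let status = match summarize(&transcript) {
            Ok(summary) => Status::Completed { transcript, summary },
            Err(e) => Status::Failed(e),
        };
        let _ = tx.send(status);
    });
    rx
}
```

A caller would iterate over the returned receiver and update the status indicator on each message; the iteration ends once the worker thread finishes.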
API Endpoints
- GET / - Main application page
- POST /api/record - Save a new recording
- GET /api/recordings - Get all recordings
- GET /api/recordings/{id} - Get specific recording details
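To make the mapping between method, path, and action concrete, here is a small standard-library sketch of how the four endpoints could be matched. It is not the project's Actix-web routing code: the `Route` enum and `match_route` function are illustrative names, and the sketch assumes `{id}` is an integer, which the README does not state.

```rust
/// The four routes listed above (enum and function names are illustrative).
#[derive(Debug, PartialEq)]
pub enum Route {
    Index,
    CreateRecording,
    ListRecordings,
    /// Assumes `{id}` is an integer; the real ID type may differ.
    GetRecording(i64),
}

/// Maps an HTTP method and path to one of the documented routes.
pub fn match_route(method: &str, path: &str) -> Option<Route> {
    // Split "/api/recordings/7" into ["api", "recordings", "7"].
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();

    match (method, segments.as_slice()) {
        ("GET", []) => Some(Route::Index),
        ("POST", ["api", "record"]) => Some(Route::CreateRecording),
        ("GET", ["api", "recordings"]) => Some(Route::ListRecordings),
        ("GET", ["api", "recordings", id]) => {
            // A non-numeric ID yields None rather than a route.
            id.parse::<i64>().ok().map(Route::GetRecording)
        }
        _ => None,
    }
}
```

Note that saving uses the singular `/api/record` while reading uses the plural `/api/recordings`, so a request such as `POST /api/recordings` matches nothing in this sketch.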
Project Structure
voiceai/
├── src/
│   ├── main.rs          # Application entry point
│   ├── handlers.rs      # HTTP request handlers
│   ├── models.rs        # Data structures
│   ├── services.rs      # AI service integration
│   └── database.rs      # Database operations
├── templates/
│   └── index.html       # Main application template
├── static/              # Static assets (if any)
├── Cargo.toml           # Rust dependencies
└── README.md            # This file
Configuration
Environment Variables
- OPENAI_API_KEY: Your OpenAI API key (required)
- DATABASE_URL: SQLite database URL (default: sqlite:voiceai.db)
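The required/default behavior of these two variables can be sketched as follows. This is a minimal illustration, not the project's actual startup code: the `Config` struct and both function names are hypothetical, and the sketch assumes the variables are already present in the process environment (loading the .env file itself would need an additional crate, which is not shown).

```rust
use std::env;

/// Runtime settings read from the environment (struct name is illustrative).
#[derive(Debug, PartialEq)]
pub struct Config {
    pub openai_api_key: String,
    pub database_url: String,
}

/// Default taken from the value documented above.
const DEFAULT_DATABASE_URL: &str = "sqlite:voiceai.db";

/// Builds a Config from a lookup function so the logic can be exercised
/// without touching the real process environment.
pub fn config_from<F>(lookup: F) -> Result<Config, String>
where
    F: Fn(&str) -> Option<String>,
{
    // OPENAI_API_KEY is required; an empty value is treated as missing.
    let openai_api_key = lookup("OPENAI_API_KEY")
        .filter(|key| !key.trim().is_empty())
        .ok_or_else(|| "OPENAI_API_KEY must be set".to_string())?;

    // DATABASE_URL is optional and falls back to the documented default.
    let database_url = lookup("DATABASE_URL")
        .filter(|url| !url.trim().is_empty())
        .unwrap_or_else(|| DEFAULT_DATABASE_URL.to_string());

    Ok(Config { openai_api_key, database_url })
}

/// Convenience wrapper that reads the real process environment.
pub fn config_from_env() -> Result<Config, String> {
    config_from(|name| env::var(name).ok())
}
```

Failing fast on a missing key at startup gives a clearer error than a rejected API request later, which is why the sketch returns `Err` instead of substituting a placeholder.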
OpenAI API Setup
- Sign up for an OpenAI account at https://platform.openai.com
- Generate an API key in your account settings
- Add the API key to your .env file
Development
Running in Development Mode
# Run with logging
RUST_LOG=info cargo run
# Run with debug logging
RUST_LOG=debug cargo run
Building for Production
cargo build --release
Limitations
This is a POC with the following limitations:
- Audio is limited to WAV format
- Maximum recording length depends on browser memory limits
- OpenAI API costs apply for transcription and summarization
- No user authentication or multi-user support
- Local storage only (no cloud backup)
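The first two limitations can be made concrete with a short sketch. It checks whether a byte buffer carries the RIFF/WAVE container header and estimates how large an uncompressed PCM WAV recording gets, which is what drives browser memory use. Both function names are hypothetical, and the 44-byte header size assumes the canonical PCM layout with no extra chunks; none of this is taken from the project's code.

```rust
/// Returns true when `bytes` starts with a RIFF/WAVE container header.
/// Only the 12-byte prefix is checked; the audio data is not validated.
pub fn looks_like_wav(bytes: &[u8]) -> bool {
    bytes.len() >= 12 && &bytes[0..4] == b"RIFF" && &bytes[8..12] == b"WAVE"
}

/// Estimates the size of an uncompressed PCM WAV file in bytes.
/// Assumes the canonical 44-byte header and a whole number of seconds.
pub fn pcm_wav_size_bytes(
    sample_rate: u32,
    channels: u16,
    bits_per_sample: u16,
    seconds: u32,
) -> u64 {
    const HEADER_BYTES: u64 = 44;
    let bytes_per_second =
        sample_rate as u64 * channels as u64 * (bits_per_sample as u64 / 8);
    HEADER_BYTES + bytes_per_second * seconds as u64
}
```

For example, one minute of 16-bit mono audio at 44.1 kHz is 44,100 × 2 × 60 = 5,292,000 bytes of samples plus the header, so roughly 5 MB per minute.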
Future Enhancements
- User authentication and multi-user support
- Cloud storage integration
- Multiple audio format support
- Real-time transcription
- Custom AI models
- Export functionality
- Mobile app support
Troubleshooting
Common Issues
- Microphone Access Denied:
  - Ensure your browser has permission to access the microphone
  - Try refreshing the page and allowing microphone access
- OpenAI API Errors:
  - Verify your API key is correct
  - Check that your OpenAI account has sufficient credits
  - Ensure the API key has access to Whisper and GPT models
- Database Errors:
  - Ensure the application has write permissions in the project directory
  - Delete the voiceai.db file to reset the database
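If DATABASE_URL has been changed from its default, the file to delete is whatever path that URL points to. The helper below is a rough sketch of extracting that path; `sqlite_file_path` is a hypothetical name, and the accepted URL shapes (`sqlite:file`, `sqlite://file`, optional `?` query parameters, `sqlite::memory:`) are an assumption about common SQLite URL forms rather than something this README specifies.

```rust
/// Extracts the on-disk file path from a SQLite URL such as "sqlite:voiceai.db".
/// Returns None for in-memory databases and for non-SQLite URLs.
pub fn sqlite_file_path(database_url: &str) -> Option<&str> {
    let rest = database_url.strip_prefix("sqlite:")?;
    // Accept the "sqlite://path" form as well.
    let rest = rest.strip_prefix("//").unwrap_or(rest);
    // Drop any "?mode=rwc" style query parameters.
    let path = rest.split('?').next().unwrap_or(rest);
    if path.is_empty() || path == ":memory:" {
        None
    } else {
        Some(path)
    }
}
```

With the default URL this yields `voiceai.db`, which matches the file named in the step above.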
Logs
Check the console output for detailed error messages and processing status.
License
This project is for educational and POC purposes. Feel free to modify and extend as needed.